-
Notifications
You must be signed in to change notification settings - Fork 6.2k
fix error: 'snprintf' was not declared in this scope #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Alexey Lapitsky <lex@realisticgroup.com>
liewegas
added a commit
that referenced
this pull request
Oct 28, 2011
fix error: 'snprintf' was not declared in this scope
liewegas
pushed a commit
that referenced
this pull request
Nov 9, 2011
These were static in auth/Crypto.cc, which was mostly fine, except when we got a signal shutting everything down for the gcov stuff, like so: Thread 21 (Thread 2164): #0 0x00007f31a800b3cd in open64 () from /lib/libpthread.so.0 #1 0x000000000081dee0 in __gcov_open () #2 0x000000000081e3fd in gcov_exit () #3 0x00007f31a67e64f2 in exit () from /lib/libc.so.6 #4 0x000000000054e1ca in handle_signal (signal=<value optimized out>) at osd/OSD.cc:600 #5 <signal handler called> #6 0x00007f31a8007a9a in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 #7 0x0000000000636d7b in Wait (this=0x2241000) at ./common/Cond.h:48 #8 SimpleMessenger::wait (this=0x2241000) at msg/SimpleMessenger.cc:2637 #9 0x00000000004a4e35 in main (argc=<value optimized out>, argv=<value optimized out>) at ceph_osd.cc:343 and a racing thread would, say, accept a connection and then crash, like so: #0 0x00007f31a800ba0b in raise () from /lib/libpthread.so.0 #1 0x0000000000696eeb in reraise_fatal (signum=2164) at global/signal_handler.cc:59 #2 0x00000000006976cc in handle_fatal_signal (signum=<value optimized out>) at global/signal_handler.cc:106 #3 <signal handler called> #4 0x00007f31a67e0ba5 in raise () from /lib/libc.so.6 #5 0x00007f31a67e46b0 in abort () from /lib/libc.so.6 #6 0x00007f31a70846bd in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6 #7 0x00007f31a7082906 in ?? () from /usr/lib/libstdc++.so.6 #8 0x00007f31a7082933 in std::terminate() () from /usr/lib/libstdc++.so.6 #9 0x00007f31a708328f in __cxa_pure_virtual () from /usr/lib/libstdc++.so.6 #10 0x0000000000690e5b in CryptoKey::decrypt (this=0x7f3195a67510, in=..., out=..., error=...) at auth/Crypto.cc:404 #11 0x000000000079ccee in void decode_decrypt_enc_bl<CephXServiceTicketInfo>(CephXServiceTicketInfo&, CryptoKey, ceph::buffer::list&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&) () #12 0x0000000000795ca3 in cephx_verify_authorizer (cct=0x2232000, keys=<value optimized out>, indata=..., ticket_info=<value optimized out>, reply_bl=<value optimized out>) at auth/cephx/CephxProtocol.cc:438 #13 0x00000000007a17cf in CephxAuthorizeHandler::verify_authorizer (this=<value optimized out>, cct=0x2232000, keys=0x2256000, authorizer_data=<value optimized out>, authorizer_reply=..., entity_name=..., global_id=@0x7f3195a67848, caps_info=..., auid=0x7f3195a67840) at auth/cephx/CephxAuthorizeHandler.cc:21 #14 0x00000000005577ff in OSD::ms_verify_authorizer (this=0x2267000, con=0x230da00, peer_type=<value optimized out>, protocol=<value optimized out>, authorizer_data=<value optimized out>, authorizer_reply=<value optimized out>, isvalid=@0x7f3195a67c0f) at osd/OSD.cc:2723 #15 0x0000000000611ce1 in ms_deliver_verify_authorizer (this=<value optimized out>, con=0x230da00, peer_type=4, protocol=2, authorizer=<value optimized out>, authorizer_reply=<value optimized out>, isvalid=@0x7f3195a67c0f) at msg/Messenger.h:145 #16 SimpleMessenger::verify_authorizer (this=<value optimized out>, con=0x230da00, peer_type=4, protocol=2, authorizer=<value optimized out>, authorizer_reply=<value optimized out>, isvalid=@0x7f3195a67c0f) at msg/SimpleMessenger.cc:2419 #17 0x00000000006309ab in SimpleMessenger::Pipe::accept (this=0x22ce280) at msg/SimpleMessenger.cc:756 #18 0x0000000000634711 in SimpleMessenger::Pipe::reader (this=0x22ce280) at msg/SimpleMessenger.cc:1546 #19 0x00000000004a7085 in SimpleMessenger::Pipe::Reader::entry (this=<value optimized out>) at msg/SimpleMessenger.h:208 #20 0x000000000060f252 in Thread::_entry_func (arg=0x874) at common/Thread.cc:42 #21 0x00007f31a8003971 in start_thread () from /lib/libpthread.so.0 #22 0x00007f31a689392d in clone () from /lib/libc.so.6 #23 0x0000000000000000 in ?? () Instead, put these on the heap. Set them up in the ceph::crypto::init() method, and tear them down in ceph::crypto::shutdown(). Fixes: #1633 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
liewegas
added a commit
that referenced
this pull request
Nov 23, 2011
Shut down MonClient before messenger, to avoid race with MonClient::tick() and MonClient::shutdown(). Fixes #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136 #1 0x00007f44475e2849 in _L_lock_953 () from /lib/libpthread.so.0 #2 0x00007f44475e266b in __pthread_mutex_lock (mutex=0x14d8dc8) at pthread_mutex_lock.c:61 #3 0x00000000005ae090 in Mutex::Lock (this=0x14d8db8, no_lockdep=false) at ./common/Mutex.h:108 #4 0x000000000068440e in MonClient::shutdown (this=0x14d8c30) at mon/MonClient.cc:386 #5 0x00000000005b2653 in ceph_tool_common_shutdown (ctx=0x14d84c0) at tools/common.cc:661 #6 0x00000000005ada29 in main (argc=7, argv=0x7fff8a2394c8) at tools/ceph.cc:304 vs #0 0x00007f44475e8a0b in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42 #1 0x00000000005eff6b in reraise_fatal (signum=11) at global/signal_handler.cc:59 #2 0x00000000005f0165 in handle_fatal_signal (signum=11) at global/signal_handler.cc:106 #3 <signal handler called> #4 0x0000000000000000 in ?? () #5 0x000000000068661a in MonClient::tick (this=0x14d8c30) at mon/MonClient.cc:621 #6 0x0000000000689e3b in MonClient::C_Tick::finish(int) () #7 0x000000000061b3c5 in SafeTimer::timer_thread (this=0x14d8df8) at common/Timer.cc:102 #8 0x000000000061c6f0 in SafeTimerThread::entry() () #9 0x00000000005f1219 in Thread::_entry_func (arg=0x14e1a00) at common/Thread.cc:41 #10 0x00007f44475e0971 in start_thread (arg=<value optimized out>) at pthread_create.c:304 #11 0x00007f4445ead92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #12 0x0000000000000000 in ?? () Signed-off-by: Sage Weil <sage@newdream.net>
liewegas
pushed a commit
that referenced
this pull request
Nov 18, 2012
Before the mon, and lockdep, in particular. #0 __pthread_mutex_lock (mutex=0x30) at pthread_mutex_lock.c:50 #1 0x0000000000816092 in ceph::log::Log::submit_entry (this=0x0, e=0x2f4a270) at log/Log.cc:138 #2 0x00000000007ee0f8 in handle_fatal_signal (signum=11) at global/signal_handler.cc:100 #3 <signal handler called> #4 0x00000000008e1300 in lockdep_will_lock (name=0x959aa7 "SignalHandler::lock", id=17) at common/lockdep.cc:163 #5 0x00000000008867fc in Mutex::_will_lock (this=0x2f20428) at ./common/Mutex.h:56 #6 0x0000000000886605 in Mutex::Lock (this=0x2f20428, no_lockdep=false) at common/Mutex.cc:81 #7 0x00000000007eeb95 in SignalHandler::entry (this=0x2f20300) at global/signal_handler.cc:198 #8 0x00000000008b0bd1 in Thread::_entry_func (arg=0x2f20300) at common/Thread.cc:43 #9 0x00007f36fefd6b50 in start_thread (arg=<optimized out>) at pthread_create.c:304 #10 0x00007f36fd80b6dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #11 0x0000000000000000 in ?? () #0 0x00007f36fefd7e75 in pthread_join (threadid=139874129766144, thread_return=0x0) at pthread_join.c:89 #1 0x00000000008b11ec in Thread::join (this=0x2f20300, prval=0x0) at common/Thread.cc:130 #2 0x00000000007eeae7 in SignalHandler::shutdown (this=0x2f20300) at global/signal_handler.cc:186 #3 0x00000000007ee9cf in SignalHandler::~SignalHandler (this=0x2f20300, __in_chrg=<optimized out>) at global/signal_handler.cc:175 #4 0x00000000007eea58 in SignalHandler::~SignalHandler (this=0x2f20300, __in_chrg=<optimized out>) at global/signal_handler.cc:176 #5 0x00000000007ee643 in shutdown_async_signal_handler () at global/signal_handler.cc:324 #6 0x00000000006de9d2 in main (argc=7, argv=0x7fffbfb8a1e8) at ceph_mon.cc:439 Signed-off-by: Sage Weil <sage@inktank.com>
liewegas
added a commit
that referenced
this pull request
Sep 15, 2014
The C_CancelOp path assumes op->session != NULL. Cancel that op before we clear it. This fixes a crash like #0 pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:39 #1 0x00007fc82690a4b1 in RWLock::get_write (this=0x18, lockdep=<optimized out>) at ./common/RWLock.h:88 #2 0x00007fc8268f4d79 in Objecter::op_cancel (this=0x1f61830, s=0x0, tid=0, r=-110) at osdc/Objecter.cc:1850 #3 0x00007fc8268ba449 in Context::complete (this=0x1f68c20, r=<optimized out>) at ./include/Context.h:64 #4 0x00007fc8269769aa in RWTimer::timer_thread (this=0x1f61950) at common/Timer.cc:268 #5 0x00007fc82697a85d in RWTimerThread::entry (this=<optimized out>) at common/Timer.cc:200 #6 0x00007fc82651ce9a in start_thread (arg=0x7fc7e3fff700) at pthread_create.c:308 Signed-off-by: Sage Weil <sage@redhat.com>
branch-predictor
pushed a commit
to branch-predictor/ceph
that referenced
this pull request
Nov 5, 2015
kv/LevelDBStore, FileStore, MonDBStore: simpler code for single-key fetches
chamdoo
pushed a commit
to chamdoo/ceph
that referenced
this pull request
Nov 13, 2015
…ocks. Summary: SizeBeingCompacted was called without any lock protection. This causes crashes, especially when running db_bench with value_size=128K. The fix is to compute SizeUnderCompaction while holding the mutex and passing in these values into the call to Finalize. (gdb) where ceph#4 leveldb::VersionSet::SizeBeingCompacted (this=this@entry=0x7f0b490931c0, level=level@entry=4) at db/version_set.cc:1827 ceph#5 0x000000000043a3c8 in leveldb::VersionSet::Finalize (this=this@entry=0x7f0b490931c0, v=v@entry=0x7f0b3b86b480) at db/version_set.cc:1420 ceph#6 0x00000000004418d1 in leveldb::VersionSet::LogAndApply (this=0x7f0b490931c0, edit=0x7f0b3dc8c200, mu=0x7f0b490835b0, new_descriptor_log=<optimized out>) at db/version_set.cc:1016 ceph#7 0x00000000004222b2 in leveldb::DBImpl::InstallCompactionResults (this=this@entry=0x7f0b49083400, compact=compact@entry=0x7f0b2b8330f0) at db/db_impl.cc:1473 ceph#8 0x0000000000426027 in leveldb::DBImpl::DoCompactionWork (this=this@entry=0x7f0b49083400, compact=compact@entry=0x7f0b2b8330f0) at db/db_impl.cc:1757 ceph#9 0x0000000000426690 in leveldb::DBImpl::BackgroundCompaction (this=this@entry=0x7f0b49083400, madeProgress=madeProgress@entry=0x7f0b41bf2d1e, deletion_state=...) at db/db_impl.cc:1268 ceph#10 0x0000000000428f42 in leveldb::DBImpl::BackgroundCall (this=0x7f0b49083400) at db/db_impl.cc:1170 ceph#11 0x000000000045348e in BGThread (this=0x7f0b49023100) at util/env_posix.cc:941 ceph#12 leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper (arg=0x7f0b49023100) at util/env_posix.cc:874 ceph#13 0x00007f0b4a7cf10d in start_thread (arg=0x7f0b41bf3700) at pthread_create.c:301 ceph#14 0x00007f0b49b4b11d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Test Plan: make check I am running db_bench with a value size of 128K to see if the segfault is fixed. Reviewers: MarkCallaghan, sheki, emayanke Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D9279
chamdoo
pushed a commit
to chamdoo/ceph
that referenced
this pull request
Nov 13, 2015
Summary:
Now this gives us the real deal stack trace:
Assertion failed: (false), function GetProperty, file db/db_impl.cc, line 4072.
Received signal 6 (Abort trap: 6)
#0 0x7fff57ce39b9
ceph#1 abort (in libsystem_c.dylib) + 125
ceph#2 basename (in libsystem_c.dylib) + 0
ceph#3 rocksdb::DBImpl::GetProperty(rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*) (in db_test) (db_impl.cc:4072)
ceph#4 rocksdb::_Test_Empty::_Run() (in db_test) (testharness.h:68)
ceph#5 rocksdb::_Test_Empty::_RunIt() (in db_test) (db_test.cc:1005)
ceph#6 rocksdb::test::RunAllTests() (in db_test) (testharness.cc:60)
ceph#7 main (in db_test) (db_test.cc:6697)
ceph#8 start (in libdyld.dylib) + 1
Test Plan: added artificial assert, saw great stack trace
Reviewers: haobo, dhruba, ljin
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18309
chamdoo
pushed a commit
to chamdoo/ceph
that referenced
this pull request
Nov 13, 2015
fix compile error on 32-bits Reviewed-by: Sage Weil <sage@redhat.com>
abhidixit
pushed a commit
to abhidixit/ceph
that referenced
this pull request
Feb 23, 2016
Hammer iam non mock1
smithfarm
added a commit
that referenced
this pull request
Mar 10, 2016
Attempts to install jewel ceph-common, ceph-mon, ceph-osd, and ceph-base package over infernalis ceph package fail due to files existing in both. See comment #4 in the tracker issue for a deeper analysis. http://tracker.ceph.com/issues/15047 Fixes: #15047 Signed-off-by: Nathan Cutler <ncutler@suse.com>
ktdreyer
pushed a commit
that referenced
this pull request
Mar 11, 2016
Attempts to install jewel ceph-common, ceph-mon, ceph-osd, and ceph-base package over infernalis ceph package fail due to files existing in both. See comment #4 in the tracker issue for a deeper analysis. http://tracker.ceph.com/issues/15047 Fixes: #15047 Signed-off-by: Nathan Cutler <ncutler@suse.com>
alimaredia
added a commit
to alimaredia/ceph
that referenced
this pull request
May 24, 2016
Signed-off-by: Ali Maredia <amaredia@redhat.com>
mattbenjamin
added a commit
to linuxbox2/ceph
that referenced
this pull request
Aug 31, 2016
This implements option ceph#4 for external boost, based on upstream discussion. In option ceph#4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
mattbenjamin
added a commit
to linuxbox2/ceph
that referenced
this pull request
Sep 1, 2016
This implements option ceph#4 for external boost, based on upstream discussion. In option ceph#4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
mattbenjamin
added a commit
to linuxbox2/ceph
that referenced
this pull request
Sep 26, 2016
This implements option ceph#4 for external boost, based on upstream discussion. In option ceph#4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
liewegas
pushed a commit
that referenced
this pull request
Sep 29, 2016
This implements option #4 for external boost, based on upstream discussion. In option #4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
liewegas
pushed a commit
that referenced
this pull request
Sep 29, 2016
This implements option #4 for external boost, based on upstream discussion. In option #4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
liewegas
pushed a commit
that referenced
this pull request
Oct 4, 2016
This implements option #4 for external boost, based on upstream discussion. In option #4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
liewegas
pushed a commit
that referenced
this pull request
Oct 18, 2016
This implements option #4 for external boost, based on upstream discussion. In option #4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
liewegas
pushed a commit
that referenced
this pull request
Oct 21, 2016
This implements option #4 for external boost, based on upstream discussion. In option #4: 1. boost is added as a submodule 2. builds default to using the attached boost module 3. building against a system-provided boost is supported, but must be configured explicitly Because all of the boost components are attached as nested submodules in the upstream boost repository, neither the nested submodules nor the root boost submodule have been cloned into modules in github.com/ceph (acked by Sage). Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
runsisi
pushed a commit
to runsisi/ceph
that referenced
this pull request
Oct 24, 2016
…er instance the caller needs to check the nullity of the parameter before calling PK11_FreeSymKey or PK11_FreeSlot, otherwise if CryptoAESKeyHandler::init failed, we will hit a segfault as follows: #0 0x00007f76844f5a95 in PK11_FreeSymKey () from /lib64/libnss3.so ceph#1 0x00007f76586b6e49 in CryptoAESKeyHandler::~CryptoAESKeyHandler() () from /lib64/librados.so.2 ceph#2 0x00007f76586b5eea in CryptoAES::get_key_handler(ceph::buffer::ptr const&, std::string&) () from /lib64/librados.so.2 ceph#3 0x00007f76586b4b9c in CryptoKey::_set_secret(int, ceph::buffer::ptr const&) () from /lib64/librados.so.2 ceph#4 0x00007f76586b4e95 in CryptoKey::decode(ceph::buffer::list::iterator&) () from /lib64/librados.so.2 ceph#5 0x00007f76586b7ee6 in KeyRing::set_modifier(char const*, char const*, EntityName&, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >&) () from /lib64/librados.so.2 ceph#6 0x00007f76586b8882 in KeyRing::decode_plaintext(ceph::buffer::list::iterator&) () from /lib64/librados.so.2 ceph#7 0x00007f76586b9803 in KeyRing::decode(ceph::buffer::list::iterator&) () from /lib64/librados.so.2 ceph#8 0x00007f76586b9a1f in KeyRing::load(CephContext*, std::string const&) () from /lib64/librados.so.2 ceph#9 0x00007f76586ba04b in KeyRing::from_ceph_context(CephContext*) () from /lib64/librados.so.2 ceph#10 0x00007f765852d0cd in MonClient::init() () from /lib64/librados.so.2 ceph#11 0x00007f76583c15f5 in librados::RadosClient::connect() () from /lib64/librados.so.2 ceph#12 0x00007f765838cb1c in rados_connect () from /lib64/librados.so.2 ... Signed-off-by: runsisi <runsisi@zte.com.cn>
runsisi
pushed a commit
to runsisi/ceph
that referenced
this pull request
Oct 26, 2016
we have to shutdown the hunting timer and reset cur_con at the same place, or the hunting timer may set cur_con before it get shutdown, which results segfault as follows: #0 0x00007fb09ffca989 in raise () from /lib64/libc.so.6 ceph#1 0x00007fb09ffcc098 in abort () from /lib64/libc.so.6 ceph#2 0x00007fb08ea52677 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /lib64/librados.so.2 ceph#3 0x00007fb08e93144c in ceph::log::SubsystemMap::should_gather(unsigned int, int) [clone .part.120] () from /lib64/librados.so.2 ceph#4 0x00007fb08e97eb15 in RefCountedObject::put() () from /lib64/librados.so.2 ceph#5 0x00007fb08eae3f9e in MonClient::~MonClient() () from /lib64/librados.so.2 ceph#6 0x00007fb08e97c2d5 in librados::RadosClient::~RadosClient() () from /lib64/librados.so.2 ceph#7 0x00007fb08e97c319 in librados::RadosClient::~RadosClient() () from /lib64/librados.so.2 ceph#8 0x00007fb08e94684a in rados_shutdown () from /lib64/librados.so.2 ceph#9 0x00007fb098074210 in __pyx_pw_5rados_5Rados_7shutdown () from /usr/lib64/python2.7/site-packages/rados.so ... Signed-off-by: runsisi <runsisi@zte.com.cn>
tchaikov
added a commit
that referenced
this pull request
May 18, 2025
Previously, in __ceph_abort and related abort handlers, we allocated
ClibBackTrace instances using raw pointers without proper cleanup. Since
these handlers terminate execution, the leaks didn't affect production
systems but were correctly flagged by ASan during testing:
```
Direct leak of 288 byte(s) in 1 object(s) allocated from:
#0 0x55aefe8cb65d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ceph_assert+0x1f465d) (BuildId: a4faeddac80b0d81062bd53ede3388c0c10680bc)
#1 0x7f3b84da988d in ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/assert.cc:157:21
#2 0x55aefe8cf04b in supressed_assertf_line22() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:22:3
#3 0x55aefe8ce4e6 in CephAssertDeathTest_ceph_assert_supresssions_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:31:3
#4 0x55aefe99135d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
#5 0x55aefe94f015 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
...
```
This commit resolves the issue by using std::unique_ptr to manage the
lifecycle of backtrace objects, ensuring proper cleanup even in
non-returning functions. While these leaks had no practical impact in
production (as the process terminates anyway), fixing them improves code
quality and eliminates false positives in memory analysis tools.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
tchaikov
added a commit
that referenced
this pull request
May 22, 2025
…ms in destructor
Fix memory leak in test_not_before_queue identified by AddressSanitizer.
Previously, the test was terminating without properly dequeueing all elements,
causing resource leaks during teardown.
This change implements a proper cleanup mechanism that:
1. Advances the queue's time until all elements become eligible for processing
2. Dequeues all remaining elements to ensure proper destruction
3. Guarantees clean teardown even for elements with future timestamps
The fix eliminates ASan-reported leaks occurring in the not_before_queue_t::enqueue
method when test cases were torn down prematurely.
A sample of the error from ASan:
```
Direct leak of 1800 byte(s) in 15 object(s) allocated from:
#0 0x7f71b1f1ab5b in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x55fb4c977058 in void not_before_queue_t<tv_t, test_time_t>::enqueue<tv_t const&>(tv_t const&) /home/kefu/dev/ceph/src/common/not_before_queue.h:164
#2 0x55fb4c97748b in NotBeforeTest::load_test_data(std::vector<tv_t, std::allocator<tv_t> > const&) /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:67
#3 0x55fb4c961112 in NotBeforeTest_RemoveIfByClass_no_cond_Test::TestBody() /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:213
#4 0x55fb4ca05f67 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#5 0x55fb4ca1c4f7 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#6 0x55fb4c9e1104 in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
#7 0x55fb4c9e16e2 in testing::TestInfo::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2874
#8 0x55fb4c9e73b4 in testing::TestSuite::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:3052
#9 0x55fb4c9f059b in testing::internal::UnitTestImpl::RunAllTests() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:6004
#10 0x55fb4ca064ff in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#11 0x55fb4ca1d1bf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#12 0x55fb4c9e124d in testing::UnitTest::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:5583
#13 0x55fb4c97a0b6 in RUN_ALL_TESTS() /home/kefu/dev/ceph/src/googletest/googletest/include/gtest/gtest.h:2334
#14 0x55fb4c979ffc in main /home/kefu/dev/ceph/src/googletest/googlemock/src/gmock_main.cc:71
#15 0x7f71ae833ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
SUMMARY: AddressSanitizer: 1800 byte(s) leaked in 15 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
ifed01
pushed a commit
that referenced
this pull request
May 30, 2025
Previously, we had memory leak in the test_bluestore_types.cc tests where
`BufferCacheShard` and `OnodeCacheShard` objects were allocated with
raw pointers but never freed, causing leaks detected by AddressSanitizer.
ASan rightly pointed this out:
```
Direct leak of 224 byte(s) in 1 object(s) allocated from:
#0 0x55a7432a079d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluestore_types+0xf2e79d) (BuildId: c3bec647afa97df6bb147bc82eac937531fc6272)
#1 0x55a743523340 in BlueStore::BufferCacheShard::create(BlueStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::common::PerfCounters*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/Bl
ueStore.cc:1678:9
#2 0x55a74330b617 in ExtentMap_seek_lextent_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluestore_types.cc:1077:7
#3 0x55a7434f2b2d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.
cc:2653:10
#4 0x55a7434b5775 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:
2689:14
#5 0x55a74347005d in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
```
Direct leak of 9928 byte(s) in 1 object(s) allocated from:
#0 0x7ff249d21a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x6048ed878b76 in BlueStore::OnodeCacheShard::create(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::common::PerfCounters*) /home/kefu/dev/ceph/src/os/bluestore/BlueStore.cc:1219
#2 0x6048ed66d4f9 in GarbageCollector_BasicTest_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/test_bluestore_types.cc:2662
#3 0x6048ed820555 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#4 0x6048ed80c78a in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#5 0x6048ed7b8bfa in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
```
In this change, we replace raw pointer allocation with unique_ptr to
ensure automatic cleanup when the objects go out of scope.
`
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
cbodley
pushed a commit
that referenced
this pull request
Jun 9, 2025
Memory leak was detected by AddressSanitizer in dbstore tests. Objects
inserted into a static objectmap were not being properly freed when
tests completed without explicitly deleting buckets.
The leak occurred because:
- Objects were preserved in a static map after insertion
- Objects were only freed by DB::objectmapDelete() when deleting the
corresponding bucket
- In tests, buckets weren't always deleted after insertion, leaving
objects allocated
ASan rightly pointed this out:
```
Direct leak of 200 byte(s) in 1 object(s) allocated from:
#0 0x55c420f5274d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_dbstore_tests+0x4df74d) (BuildId: c19ed2d2b1ead306fdc59fc311f546a287300010)
#1 0x55c42143cfaf in SQLGetBucket::Execute(DoutPrefixProvider const*, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/sqlite/sqliteDB.cc:1689:11
#2 0x55c42102751c in rgw::store::DB::ProcessOp(DoutPrefixProvider const*, std::basic_string_view<char, std::char_traits<char>>, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:251:16
#3 0x55c4210328c9 in rgw::store::DB::get_bucket_info(DoutPrefixProvider const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, RGWBucketInfo&, std::map<std::__cxx11
::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::c
har_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>>*, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>*, obj_version*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:450:9
#4 0x55c421034357 in rgw::store::DB::create_bucket(DoutPrefixProvider const*, std::variant<rgw_user, rgw_account_id> const&, rgw_bucket const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, rgw_placement_rule const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>> const&, std::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> const&, std::optional<RGWQuotaInfo> const&, std::optional<std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>>, obj_version*, RGWBucketInfo&, optional_yield) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:504:9
#5 0x55c420f6c3ed in DBStoreTest_CreateBucket_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/tests/dbstore_tests.cc:495:13
#6 0x55c42174fa0d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
#7 0x55c421717f25 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
#8 0x55c4216d4bbd in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
In this change, we:
- Free objectmap entries in DB::Destroy() to ensure cleanup on shutdown
- Call DB::Destroy() in DBStoreTest::TearDown() to guarantee cleanup after each test
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
tchaikov
added a commit
that referenced
this pull request
Jun 10, 2025
Previously, error messages passed to luaL_error() were formatted using
std::string concatenation. Since luaL_error() never returns (it throws
a Lua exception via longjmp), the allocated std::string memory was
leaked, as detected by AddressSanitizer:
```
Direct leak of 105 byte(s) in 1 object(s) allocated from:
#0 0x7fc5f1921a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x563bd89144c7 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
#2 0x563bd89144c7 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
#3 0x563bd89144c7 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
#4 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
#5 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
#6 0x563bd896ae1b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:363
#7 0x563bd896b256 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:455
#8 0x563bd896b2bb in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*) /usr/include/c++/15.1.1/bits/basic_string.h:1585
#9 0x563bd943c2f2 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) /usr/include/c++/15.1.1/bits/basic_string.h:3977
#10 0x563bd943c2f2 in rgw::lua::lua_state_guard::runtime_hook(lua_State*, lua_Debug*) /home/kefu/dev/ceph/src/rgw/rgw_lua_utils.cc:245
#11 0x7fc5f139f8ef (/usr/lib/liblua.so.5.4+0xe8ef) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#12 0x7fc5f139fbfe (/usr/lib/liblua.so.5.4+0xebfe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#13 0x7fc5f13b26fe (/usr/lib/liblua.so.5.4+0x216fe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#14 0x7fc5f139f581 (/usr/lib/liblua.so.5.4+0xe581) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#15 0x7fc5f139b735 (/usr/lib/liblua.so.5.4+0xa735) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#16 0x7fc5f139ba8f (/usr/lib/liblua.so.5.4+0xaa8f) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#17 0x7fc5f139f696 in lua_pcallk (/usr/lib/liblua.so.5.4+0xe696) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
#18 0x563bd8a925ef in rgw::lua::request::execute(rgw::sal::Driver*, RGWREST*, OpsLogSink*, req_state*, RGWOp*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/kefu/dev/ceph/src/rgw/rgw_lua_request.cc:824
#19 0x563bd8952e3d in TestRGWLua_LuaRuntimeLimit_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_lua.cc:1628
#20 0x563bd8a37817 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#21 0x563bd8a509b5 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/home/kefu/dev/ceph/build/bin/unittest_rgw_lua+0x11199b5) (BuildId: b2628caba5290d882d25f7bea166f058b682bc85)`
```
This change replaces std::string formatting with stack-allocated buffer
and std::to_chars() to eliminate the memory leak.
Note: We cannot format int64_t directly through luaL_error() because
lua_pushfstring() does not support long long or int64_t format specifiers,
even in Lua 5.4 (see https://www.lua.org/manual/5.4/manual.html#lua_pushfstring).
Since libstdc++ uses int64_t for std::chrono::milliseconds::rep, we use
std::to_chars() for safe, efficient conversion without heap allocation.
The maximum runtime limit was a configuration introduced by 3e3cb15.
Fixes: https://tracker.ceph.com/issues/71595
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
aainscow
pushed a commit
to aainscow/ceph
that referenced
this pull request
Jun 10, 2025
Previously, in __ceph_abort and related abort handlers, we allocated
ClibBackTrace instances using raw pointers without proper cleanup. Since
these handlers terminate execution, the leaks didn't affect production
systems but were correctly flagged by ASan during testing:
```
Direct leak of 288 byte(s) in 1 object(s) allocated from:
#0 0x55aefe8cb65d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ceph_assert+0x1f465d) (BuildId: a4faeddac80b0d81062bd53ede3388c0c10680bc)
ceph#1 0x7f3b84da988d in ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/assert.cc:157:21
ceph#2 0x55aefe8cf04b in supressed_assertf_line22() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:22:3
ceph#3 0x55aefe8ce4e6 in CephAssertDeathTest_ceph_assert_supresssions_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:31:3
ceph#4 0x55aefe99135d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
ceph#5 0x55aefe94f015 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
...
```
This commit resolves the issue by using std::unique_ptr to manage the
lifecycle of backtrace objects, ensuring proper cleanup even in
non-returning functions. While these leaks had no practical impact in
production (as the process terminates anyway), fixing them improves code
quality and eliminates false positives in memory analysis tools.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
aainscow
pushed a commit
to aainscow/ceph
that referenced
this pull request
Jun 10, 2025
…ms in destructor
Fix memory leak in test_not_before_queue identified by AddressSanitizer.
Previously, the test was terminating without properly dequeueing all elements,
causing resource leaks during teardown.
This change implements a proper cleanup mechanism that:
1. Advances the queue's time until all elements become eligible for processing
2. Dequeues all remaining elements to ensure proper destruction
3. Guarantees clean teardown even for elements with future timestamps
The fix eliminates ASan-reported leaks occurring in the not_before_queue_t::enqueue
method when test cases were torn down prematurely.
A sample of the error from ASan:
```
Direct leak of 1800 byte(s) in 15 object(s) allocated from:
#0 0x7f71b1f1ab5b in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:86
ceph#1 0x55fb4c977058 in void not_before_queue_t<tv_t, test_time_t>::enqueue<tv_t const&>(tv_t const&) /home/kefu/dev/ceph/src/common/not_before_queue.h:164
ceph#2 0x55fb4c97748b in NotBeforeTest::load_test_data(std::vector<tv_t, std::allocator<tv_t> > const&) /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:67
ceph#3 0x55fb4c961112 in NotBeforeTest_RemoveIfByClass_no_cond_Test::TestBody() /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:213
ceph#4 0x55fb4ca05f67 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#5 0x55fb4ca1c4f7 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
ceph#6 0x55fb4c9e1104 in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
ceph#7 0x55fb4c9e16e2 in testing::TestInfo::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2874
ceph#8 0x55fb4c9e73b4 in testing::TestSuite::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:3052
ceph#9 0x55fb4c9f059b in testing::internal::UnitTestImpl::RunAllTests() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:6004
ceph#10 0x55fb4ca064ff in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#11 0x55fb4ca1d1bf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
ceph#12 0x55fb4c9e124d in testing::UnitTest::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:5583
ceph#13 0x55fb4c97a0b6 in RUN_ALL_TESTS() /home/kefu/dev/ceph/src/googletest/googletest/include/gtest/gtest.h:2334
ceph#14 0x55fb4c979ffc in main /home/kefu/dev/ceph/src/googletest/googlemock/src/gmock_main.cc:71
ceph#15 0x7f71ae833ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
SUMMARY: AddressSanitizer: 1800 byte(s) leaked in 15 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
aainscow
pushed a commit
to aainscow/ceph
that referenced
this pull request
Jun 10, 2025
Previously, we had memory leak in the test_bluestore_types.cc tests where
`BufferCacheShard` and `OnodeCacheShard` objects were allocated with
raw pointers but never freed, causing leaks detected by AddressSanitizer.
ASan rightly pointed this out:
```
Direct leak of 224 byte(s) in 1 object(s) allocated from:
#0 0x55a7432a079d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluestore_types+0xf2e79d) (BuildId: c3bec647afa97df6bb147bc82eac937531fc6272)
ceph#1 0x55a743523340 in BlueStore::BufferCacheShard::create(BlueStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::common::PerfCounters*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/Bl
ueStore.cc:1678:9
ceph#2 0x55a74330b617 in ExtentMap_seek_lextent_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluestore_types.cc:1077:7
ceph#3 0x55a7434f2b2d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.
cc:2653:10
ceph#4 0x55a7434b5775 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:
2689:14
ceph#5 0x55a74347005d in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
```
Direct leak of 9928 byte(s) in 1 object(s) allocated from:
#0 0x7ff249d21a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
ceph#1 0x6048ed878b76 in BlueStore::OnodeCacheShard::create(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::common::PerfCounters*) /home/kefu/dev/ceph/src/os/bluestore/BlueStore.cc:1219
ceph#2 0x6048ed66d4f9 in GarbageCollector_BasicTest_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/test_bluestore_types.cc:2662
ceph#3 0x6048ed820555 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#4 0x6048ed80c78a in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
ceph#5 0x6048ed7b8bfa in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
```
In this change, we replace raw pointer allocation with unique_ptr to
ensure automatic cleanup when the objects go out of scope.
`
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
aainscow
pushed a commit
to aainscow/ceph
that referenced
this pull request
Jun 10, 2025
Memory leak was detected by AddressSanitizer in dbstore tests. Objects
inserted into a static objectmap were not being properly freed when
tests completed without explicitly deleting buckets.
The leak occurred because:
- Objects were preserved in a static map after insertion
- Objects were only freed by DB::objectmapDelete() when deleting the
corresponding bucket
- In tests, buckets weren't always deleted after insertion, leaving
objects allocated
ASan rightly pointed this out:
```
Direct leak of 200 byte(s) in 1 object(s) allocated from:
#0 0x55c420f5274d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_dbstore_tests+0x4df74d) (BuildId: c19ed2d2b1ead306fdc59fc311f546a287300010)
ceph#1 0x55c42143cfaf in SQLGetBucket::Execute(DoutPrefixProvider const*, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/sqlite/sqliteDB.cc:1689:11
ceph#2 0x55c42102751c in rgw::store::DB::ProcessOp(DoutPrefixProvider const*, std::basic_string_view<char, std::char_traits<char>>, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:251:16
ceph#3 0x55c4210328c9 in rgw::store::DB::get_bucket_info(DoutPrefixProvider const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, RGWBucketInfo&, std::map<std::__cxx11
::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::c
har_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>>*, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>*, obj_version*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:450:9
ceph#4 0x55c421034357 in rgw::store::DB::create_bucket(DoutPrefixProvider const*, std::variant<rgw_user, rgw_account_id> const&, rgw_bucket const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, rgw_placement_rule const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>> const&, std::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> const&, std::optional<RGWQuotaInfo> const&, std::optional<std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>>, obj_version*, RGWBucketInfo&, optional_yield) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:504:9
ceph#5 0x55c420f6c3ed in DBStoreTest_CreateBucket_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/tests/dbstore_tests.cc:495:13
ceph#6 0x55c42174fa0d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
ceph#7 0x55c421717f25 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
ceph#8 0x55c4216d4bbd in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
In this change, we:
- Free objectmap entries in DB::Destroy() to ensure cleanup on shutdown
- Call DB::Destroy() in DBStoreTest::TearDown() to guarantee cleanup after each test
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
tchaikov
added a commit
that referenced
this pull request
Jun 12, 2025
…write
Fix 4KB memory leak in ErasureCodePlugin_parity_delta_write_Test caused by
unmanaged raw buffer allocation. The test was allocating a 4096-byte raw
buffer to replace shard 4 for delta encoding validation, but the buffer::ptr
constructed from the raw pointer did not manage the buffer's lifecycle.
Detected by AddressSanitizer:
```
Direct leak of 4096 byte(s) in 1 object(s) allocated from:
#0 0x7fb73a720e15 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x5562f4062ccc in ErasureCodePlugin_parity_delta_write_Test::TestBody() /home/kefu/dev/ceph/src/test/erasure-code/TestErasureCodePluginJerasure.cc:122
#2 0x5562f41081a1 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#3 0x5562f40f3004 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#4 0x5562f409cbba in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728```
```
In this change, we replace raw pointer allocation with
create_bufferptr() to ensure proper memory management by buffer::ptr.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
cbodley
pushed a commit
that referenced
this pull request
Jun 12, 2025
Previously, we had Resource leak in Directory::for_each() where
directory streams opened with fdopendir() were never properly closed,
causing a 32KB leak per unclosed directory handle.
In this change, we add closedir() call to properly release directory stream
resources after enumeration completes.
ASan report:
```
Direct leak of 32816 byte(s) in 1 object(s) allocated from:
#0 0x738387320e15 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x738381383514 (/usr/lib/libc.so.6+0xe3514) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
#2 0x738381383418 in fdopendir (/usr/lib/libc.so.6+0xe3418) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
#3 0x6433e1aefac4 in for_each<rgw::sal::Directory::fill_cache(const DoutPrefixProvider*, optional_yield, rgw::sal::fill_cache_cb_t&)::<lambda(char const*)> > /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:874
#4 0x6433e1aa10ab in rgw::sal::Directory::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_ent
ry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:1077
#5 0x6433e1aa8974 in rgw::sal::MPDirectory::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_e
ntry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:1384
#6 0x6433e1abf1ee in rgw::sal::POSIXBucket::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_e
ntry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:2236
#7 0x6433e1c43956 in file::listing::BucketCache<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>::fill(DoutPrefixProvider const*, file::listing::BucketCacheEntry<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>*, rgw::sal::POSIXBucket*, unsigned int, optional_yield) /home/kef
u/dev/ceph/src/rgw/driver/posix/bucket_cache.h:368
#8 0x6433e1c1c71c in file::listing::BucketCache<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>::list_bucket(DoutPrefixProvider const*, optional_yield, rgw::sal::POSIXBucket*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, fu2::abi_310::d
etail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, bool (rgw_bucket_dir_entry const&) const> >) /home/kefu/dev/ceph/src/rgw/driver/posix/bucket_cache.h:410
#9 0x6433e1ac1829 in rgw::sal::POSIXBucket::list(DoutPrefixProvider const*, rgw::sal::Bucket::ListParams&, int, rgw::sal::Bucket::ListResults&, optional_yield) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:2256
#10 0x6433e1ae1302 in rgw::sal::POSIXMultipartUpload::list_parts(DoutPrefixProvider const*, ceph::common::CephContext*, int, int, int*, bool*, optional_yield, bool) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:3692
#11 0x6433e1ae2f3b in rgw::sal::POSIXMultipartUpload::complete(DoutPrefixProvider const*, optional_yield, ceph::common::CephContext*, std::map<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<int>, std::allocator<std::pair<
int const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::__cxx11::list<rgw_obj_index_key, std::allocator<rgw_obj_index_key> >&, unsigned long&, bool&, RGWCompressionInfo&, long&, std::__cxx11::basic_string<char, std::char_trait
s<char>, std::allocator<char> >&, ACLOwner&, unsigned long, rgw::sal::Object*, boost::container::flat_map<unsigned int, boost::container::flat_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std
::char_traits<char>, std::allocator<char> > >, void>, std::less<unsigned int>, void>&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:3770
#12 0x6433e0fffb73 in POSIXMPObjectTest::create_MPObj(rgw::sal::MultipartUpload*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) /home/kefu/dev/ceph/src/test/rgw/test_rgw_posix_driver.cc:1717
#13 0x6433e0e95ab9 in POSIXMPObjectTest_BucketList_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_posix_driver.cc:1863
#14 0x6433e1308d17 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
```
Fixes https://tracker.ceph.com/issues/71505
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
idryomov
pushed a commit
that referenced
this pull request
Jun 19, 2025
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
idryomov
pushed a commit
to idryomov/ceph
that referenced
this pull request
Jun 23, 2025
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
ceph#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
ceph#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
idryomov
pushed a commit
to idryomov/ceph
that referenced
this pull request
Jun 23, 2025
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
ceph#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
ceph#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
idryomov
pushed a commit
to idryomov/ceph
that referenced
this pull request
Jun 23, 2025
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
ceph#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
ceph#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
SrinivasaBharath
pushed a commit
to SrinivasaBharath/ceph
that referenced
this pull request
Jul 18, 2025
Fix a memory leak where HybridAllocator's embedded BitmapAllocator
(bmap_alloc) was not properly cleaned up during destruction. The issue
occurred because virtual function calls in destructors don't dispatch
to derived class implementations - when AvlAllocator::~AvlAllocator()
calls shutdown(), it invokes AvlAllocator::shutdown() rather than
HybridAllocatorBase::shutdown(), leaving bmap_alloc unfreed.
This issue only affects unit tests and does not impact production,
as BlueFS always calls shutdown() explicitly when stopping allocators
in BlueFS::_stop_alloc()
This change has minimal impact on existing code that already calls
shutdown() explicitly, since shutdown() has idempotent behavior and
can be safely called multiple times. Additionally, destructors are
only invoked during system teardown, so the potential double shutdown()
call does not affect runtime performance.
Changes:
- Replace raw pointer bmap_alloc with unique_ptr for automatic cleanup
- Add explicit shutdown() call in BitmapAllocator destructor for RAII
- Replace manual bmap_alloc->shutdown() with bmap_alloc.reset()
- Add explicit default destructor to HybridAllocatorBase for clarity
This eliminates the need for manual shutdown() calls and ensures
proper resource cleanup through RAII principles.
Fixes ASan-reported indirect leak of BitmapAllocator resources.
Please see the leak reported by ASan for more details:
```
Indirect leak of 23 byte(s) in 1 object(s) allocated from:
#0 0x7fe30b121a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
ceph#1 0x5614e057ee11 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
ceph#2 0x5614e057d279 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
ceph#3 0x5614e057d279 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
ceph#4 0x5614e057d279 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
ceph#5 0x5614e057cf27 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
ceph#6 0x5614e05fe91b in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<true>(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:291
ceph#7 0x5614e05f3cd4 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/15.1.
1/bits/basic_string.h:617
ceph#8 0x5614e069b892 in ceph::mutex_debug_detail::mutex_debug_impl<false>::mutex_debug_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) /home/kefu/dev/ceph/src/common/mutex_debug.
h:103
ceph#9 0x5614e06b8586 in ceph::mutex_debug_detail::mutex_debug_impl<false> ceph::make_mutex<char const (&) [23]>(char const (&) [23]) /home/kefu/dev/ceph/src/common/ceph_mutex.h:118
ceph#10 0x5614e06b8350 in AllocatorLevel02<AllocatorLevel01Loose>::AllocatorLevel02() /home/kefu/dev/ceph/src/os/bluestore/fastbmap_allocator_impl.h:526
ceph#11 0x5614e06b393b in BitmapAllocator::BitmapAllocator(ceph::common::CephContext*, long, long, std::basic_string_view<char, std::char_traits<char> >) /home/kefu/dev/ceph/src/os/bluestore/BitmapAllocator.cc:16
ceph#12 0x5614e072755f in HybridAllocatorBase<AvlAllocator>::_spillover_range(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator_impl.h:123
ceph#13 0x5614e06c9045 in AvlAllocator::_try_insert_range(unsigned long, unsigned long, boost::intrusive::tree_iterator<boost::intrusive::mhtraits<range_seg_t, boost::intrusive::avl_set_member_hook<>, &range_seg_t::offset_hook>,
false>*) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.h:210
ceph#14 0x5614e06c0104 in AvlAllocator::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:136
ceph#15 0x5614e0586e5f in HybridAllocatorBase<AvlAllocator>::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator.h:110
ceph#16 0x5614e06c6559 in AvlAllocator::init_add_free(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:494
ceph#17 0x5614e0582943 in HybridAllocator_fragmentation_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/hybrid_allocator_test.cc:285
ceph#18 0x5614e061bee3 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#19 0x5614e0606562 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
vshankar
pushed a commit
that referenced
this pull request
Jul 24, 2025
Removing vxattr 'ceph.dir.subvolume' on a directory without it being set causes the mds to crash. This is because the snaprealm would be null for the directory and the null check is missing. Setting the vxattr, creates the snaprealm for the directory as part of it. Hence, mds doesn't crash when the vxattr is set and then removed. This patch fixes the same. Reproducer: $mkdir /mnt/dir1 $setfattr -x "ceph.dir.subvolume" /mnt/dir1 Traceback: ------- Core was generated by `./ceph/build/bin/ceph-mds -i a -c ./ceph/build/ceph.conf'. Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007f33f1aa8034 in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x00007f33f1a4ef1e in raise () from /lib64/libc.so.6 #2 0x0000562b148a6fd0 in reraise_fatal (signum=signum@entry=11) at /ceph/src/global/signal_handler.cc:88 #3 0x0000562b148a83d9 in handle_oneshot_fatal_signal (signum=11) at /ceph/src/global/signal_handler.cc:367 #4 <signal handler called> #5 Server::handle_client_setvxattr (this=0x562b4ee3f800, mdr=..., cur=0x562b4ef9cc00) at /ceph/src/mds/Server.cc:6406 #6 0x0000562b145fadc2 in Server::handle_client_removexattr (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:7022 #7 0x0000562b145fbff0 in Server::dispatch_client_request (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:2825 #8 0x0000562b145fcfa2 in Server::handle_client_request (this=0x562b4ee3f800, req=...) at /ceph/src/mds/Server.cc:2676 #9 0x0000562b1460063c in Server::dispatch (this=0x562b4ee3f800, m=...) at /ceph/src/mds/Server.cc:382 #10 0x0000562b1450eb22 in MDSRank::handle_message (this=this@entry=0x562b4ef42008, m=...) at /ceph/src/mds/MDSRank.cc:1222 #11 0x0000562b14510c93 in MDSRank::_dispatch (this=this@entry=0x562b4ef42008, m=..., new_msg=new_msg@entry=true) at /ceph/src/mds/MDSRank.cc:1045 #12 0x0000562b14511620 in MDSRankDispatcher::ms_dispatch (this=this@entry=0x562b4ef42000, m=...) at /ceph/src/mds/MDSRank.cc:1019 #13 0x0000562b144ff117 in MDSDaemon::ms_dispatch2 (this=0x562b4ee64000, m=...) at /ceph/src/common/RefCountedObj.h:56 #14 0x00007f33f2f4974a in Messenger::ms_deliver_dispatch (this=0x562b4ee70000, m=...) at /ceph/src/msg/Messenger.h:746 #15 0x00007f33f2f467e2 in DispatchQueue::entry (this=0x562b4ee703b8) at /ceph/src/msg/DispatchQueue.cc:202 #16 0x00007f33f2ff61cb in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /ceph/src/msg/DispatchQueue.h:101 #17 0x00007f33f2df4b5d in Thread::entry_wrapper (this=0x562b4ee70518) at /ceph/src/common/Thread.cc:87 #18 0x00007f33f2df4b6f in Thread::_entry_func (arg=<optimized out>) at /ceph/src/common/Thread.cc:74 #19 0x00007f33f1aa6088 in start_thread () from /lib64/libc.so.6 #20 0x00007f33f1b29f8c in clone3 () from /lib64/libc.so.6 --------- Fixes: https://tracker.ceph.com/issues/70794 Signed-off-by: Kotresh HR <khiremat@redhat.com>
stzuraski898
pushed a commit
to stzuraski898/ceph
that referenced
this pull request
Jul 29, 2025
Removing vxattr 'ceph.dir.subvolume' on a directory without it being set causes the mds to crash. This is because the snaprealm would be null for the directory and the null check is missing. Setting the vxattr, creates the snaprealm for the directory as part of it. Hence, mds doesn't crash when the vxattr is set and then removed. This patch fixes the same. Reproducer: $mkdir /mnt/dir1 $setfattr -x "ceph.dir.subvolume" /mnt/dir1 Traceback: ------- Core was generated by `./ceph/build/bin/ceph-mds -i a -c ./ceph/build/ceph.conf'. Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007f33f1aa8034 in __pthread_kill_implementation () from /lib64/libc.so.6 ceph#1 0x00007f33f1a4ef1e in raise () from /lib64/libc.so.6 ceph#2 0x0000562b148a6fd0 in reraise_fatal (signum=signum@entry=11) at /ceph/src/global/signal_handler.cc:88 ceph#3 0x0000562b148a83d9 in handle_oneshot_fatal_signal (signum=11) at /ceph/src/global/signal_handler.cc:367 ceph#4 <signal handler called> ceph#5 Server::handle_client_setvxattr (this=0x562b4ee3f800, mdr=..., cur=0x562b4ef9cc00) at /ceph/src/mds/Server.cc:6406 ceph#6 0x0000562b145fadc2 in Server::handle_client_removexattr (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:7022 ceph#7 0x0000562b145fbff0 in Server::dispatch_client_request (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:2825 ceph#8 0x0000562b145fcfa2 in Server::handle_client_request (this=0x562b4ee3f800, req=...) at /ceph/src/mds/Server.cc:2676 ceph#9 0x0000562b1460063c in Server::dispatch (this=0x562b4ee3f800, m=...) at /ceph/src/mds/Server.cc:382 ceph#10 0x0000562b1450eb22 in MDSRank::handle_message (this=this@entry=0x562b4ef42008, m=...) at /ceph/src/mds/MDSRank.cc:1222 ceph#11 0x0000562b14510c93 in MDSRank::_dispatch (this=this@entry=0x562b4ef42008, m=..., new_msg=new_msg@entry=true) at /ceph/src/mds/MDSRank.cc:1045 ceph#12 0x0000562b14511620 in MDSRankDispatcher::ms_dispatch (this=this@entry=0x562b4ef42000, m=...) at /ceph/src/mds/MDSRank.cc:1019 ceph#13 0x0000562b144ff117 in MDSDaemon::ms_dispatch2 (this=0x562b4ee64000, m=...) at /ceph/src/common/RefCountedObj.h:56 ceph#14 0x00007f33f2f4974a in Messenger::ms_deliver_dispatch (this=0x562b4ee70000, m=...) at /ceph/src/msg/Messenger.h:746 ceph#15 0x00007f33f2f467e2 in DispatchQueue::entry (this=0x562b4ee703b8) at /ceph/src/msg/DispatchQueue.cc:202 ceph#16 0x00007f33f2ff61cb in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /ceph/src/msg/DispatchQueue.h:101 ceph#17 0x00007f33f2df4b5d in Thread::entry_wrapper (this=0x562b4ee70518) at /ceph/src/common/Thread.cc:87 ceph#18 0x00007f33f2df4b6f in Thread::_entry_func (arg=<optimized out>) at /ceph/src/common/Thread.cc:74 ceph#19 0x00007f33f1aa6088 in start_thread () from /lib64/libc.so.6 ceph#20 0x00007f33f1b29f8c in clone3 () from /lib64/libc.so.6 --------- Fixes: https://tracker.ceph.com/issues/70794 Signed-off-by: Kotresh HR <khiremat@redhat.com>
stzuraski898
pushed a commit
to stzuraski898/ceph
that referenced
this pull request
Jul 29, 2025
Fix a memory leak where HybridAllocator's embedded BitmapAllocator
(bmap_alloc) was not properly cleaned up during destruction. The issue
occurred because virtual function calls in destructors don't dispatch
to derived class implementations - when AvlAllocator::~AvlAllocator()
calls shutdown(), it invokes AvlAllocator::shutdown() rather than
HybridAllocatorBase::shutdown(), leaving bmap_alloc unfreed.
This issue only affects unit tests and does not impact production,
as BlueFS always calls shutdown() explicitly when stopping allocators
in BlueFS::_stop_alloc()
This change has minimal impact on existing code that already calls
shutdown() explicitly, since shutdown() has idempotent behavior and
can be safely called multiple times. Additionally, destructors are
only invoked during system teardown, so the potential double shutdown()
call does not affect runtime performance.
Changes:
- Replace raw pointer bmap_alloc with unique_ptr for automatic cleanup
- Add explicit shutdown() call in BitmapAllocator destructor for RAII
- Replace manual bmap_alloc->shutdown() with bmap_alloc.reset()
- Add explicit default destructor to HybridAllocatorBase for clarity
This eliminates the need for manual shutdown() calls and ensures
proper resource cleanup through RAII principles.
Fixes ASan-reported indirect leak of BitmapAllocator resources.
Please see the leak reported by ASan for more details:
```
Indirect leak of 23 byte(s) in 1 object(s) allocated from:
#0 0x7fe30b121a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
ceph#1 0x5614e057ee11 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
ceph#2 0x5614e057d279 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
ceph#3 0x5614e057d279 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
ceph#4 0x5614e057d279 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
ceph#5 0x5614e057cf27 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
ceph#6 0x5614e05fe91b in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<true>(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:291
ceph#7 0x5614e05f3cd4 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/15.1.
1/bits/basic_string.h:617
ceph#8 0x5614e069b892 in ceph::mutex_debug_detail::mutex_debug_impl<false>::mutex_debug_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) /home/kefu/dev/ceph/src/common/mutex_debug.
h:103
ceph#9 0x5614e06b8586 in ceph::mutex_debug_detail::mutex_debug_impl<false> ceph::make_mutex<char const (&) [23]>(char const (&) [23]) /home/kefu/dev/ceph/src/common/ceph_mutex.h:118
ceph#10 0x5614e06b8350 in AllocatorLevel02<AllocatorLevel01Loose>::AllocatorLevel02() /home/kefu/dev/ceph/src/os/bluestore/fastbmap_allocator_impl.h:526
ceph#11 0x5614e06b393b in BitmapAllocator::BitmapAllocator(ceph::common::CephContext*, long, long, std::basic_string_view<char, std::char_traits<char> >) /home/kefu/dev/ceph/src/os/bluestore/BitmapAllocator.cc:16
ceph#12 0x5614e072755f in HybridAllocatorBase<AvlAllocator>::_spillover_range(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator_impl.h:123
ceph#13 0x5614e06c9045 in AvlAllocator::_try_insert_range(unsigned long, unsigned long, boost::intrusive::tree_iterator<boost::intrusive::mhtraits<range_seg_t, boost::intrusive::avl_set_member_hook<>, &range_seg_t::offset_hook>,
false>*) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.h:210
ceph#14 0x5614e06c0104 in AvlAllocator::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:136
ceph#15 0x5614e0586e5f in HybridAllocatorBase<AvlAllocator>::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator.h:110
ceph#16 0x5614e06c6559 in AvlAllocator::init_add_free(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:494
ceph#17 0x5614e0582943 in HybridAllocator_fragmentation_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/hybrid_allocator_test.cc:285
ceph#18 0x5614e061bee3 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#19 0x5614e0606562 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
tchaikov
added a commit
that referenced
this pull request
Aug 4, 2025
Previously, run-cli-tests ignored all environment variables from the parent
process to ensure a clean test environment. However, this also dropped
sanitizer settings (ASAN_OPTIONS and LSAN_OPTIONS) needed when AddressSanitizer
is enabled.
This causes test failures with TCMalloc due to false-positive leak reports
from TCMalloc's internal objects, which is a known issue documented in
Google's C++ style guide. While recent gperftools releases have fixed this,
Ubuntu Jammy still ships with an older version that triggers these warnings.
This change preserves sanitizer environment variables while maintaining
the clean test environment for other variables.
Note: Once we upgrade to newer gperftools, we can remove the related
suppression rule in qa/lsan.supp.
The test failure with TCMalloc looks like:
```
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t: failed
--- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t
+++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t.err
@@ -21,3 +21,19 @@
stats
histogram [prefix]
+
+ =================================================================
+ ==87908==ERROR: LeakSanitizer: detected memory leaks
+
+ Direct leak of 45 byte(s) in 1 object(s) allocated from:
+ #0 0x562fd797265d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/ceph-kvstore-tool+0xe5e65d) (BuildId: 7eb56077b615aeb3c7aedafa0818ad89fdf3702d)
+ #1 0x562fd79815c8 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/new_allocator.h:137:27
+ #2 0x562fd7981520 in std::allocator<char>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/allocator.h:188:32
+ #3 0x562fd7981520 in std::allocator_traits<std::allocator<char>>::allocate(std::allocator<char>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/alloc_traits.h:464:20
+ #4 0x562fd798115a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_create(unsigned long&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:155:14
+ #5 0x562fd798787f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:328:21
+ #6 0x562fd79876a7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_append(char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:420:8
+ #7 0x7fa1aa0286f0 in MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId: eeef3d1257388a806e122398dbce3157ee568ef4)
+
+ SUMMARY: AddressSanitizer: 45 byte(s) leaked in 1 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
ArbitCode
pushed a commit
to ArbitCode/ceph
that referenced
this pull request
Aug 19, 2025
Previously, run-cli-tests ignored all environment variables from the parent
process to ensure a clean test environment. However, this also dropped
sanitizer settings (ASAN_OPTIONS and LSAN_OPTIONS) needed when AddressSanitizer
is enabled.
This causes test failures with TCMalloc due to false-positive leak reports
from TCMalloc's internal objects, which is a known issue documented in
Google's C++ style guide. While recent gperftools releases have fixed this,
Ubuntu Jammy still ships with an older version that triggers these warnings.
This change preserves sanitizer environment variables while maintaining
the clean test environment for other variables.
Note: Once we upgrade to newer gperftools, we can remove the related
suppression rule in qa/lsan.supp.
The test failure with TCMalloc looks like:
```
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t: failed
--- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t
+++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t.err
@@ -21,3 +21,19 @@
stats
histogram [prefix]
+
+ =================================================================
+ ==87908==ERROR: LeakSanitizer: detected memory leaks
+
+ Direct leak of 45 byte(s) in 1 object(s) allocated from:
+ #0 0x562fd797265d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/ceph-kvstore-tool+0xe5e65d) (BuildId: 7eb56077b615aeb3c7aedafa0818ad89fdf3702d)
+ ceph#1 0x562fd79815c8 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/new_allocator.h:137:27
+ ceph#2 0x562fd7981520 in std::allocator<char>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/allocator.h:188:32
+ ceph#3 0x562fd7981520 in std::allocator_traits<std::allocator<char>>::allocate(std::allocator<char>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/alloc_traits.h:464:20
+ ceph#4 0x562fd798115a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_create(unsigned long&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:155:14
+ ceph#5 0x562fd798787f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:328:21
+ ceph#6 0x562fd79876a7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_append(char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:420:8
+ ceph#7 0x7fa1aa0286f0 in MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId: eeef3d1257388a806e122398dbce3157ee568ef4)
+
+ SUMMARY: AddressSanitizer: 45 byte(s) leaked in 1 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
tchaikov
added a commit
that referenced
this pull request
Sep 7, 2025
Replace unsafe string construction with bufferlist::length() to avoid reading beyond buffer boundaries. In commit 92ccbff, we introduced a bug when checking if a digest was empty by constructing a std::string from bufferlist: ``` std::string(digest.second.c_str()).empty() ``` This is unsafe because bufferlist data is not guaranteed to be null- terminated. The std::string constructor searches for a null terminator and may read beyond the bufferlist's allocated memory, causing a heap-buffer-overflow detected by AddressSanitizer: ``` ==66092==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e0c65215004 at pc 0x7fbc6e27c597 bp 0x7ffe29fb6100 sp 0x7ffe29fb58b8 READ of size 5 at 0x7e0c65215004 thread T0 #0 0x7fbc6e27c596 in strlen /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:425 #1 0x562c75fad91a in std::char_traits<char>::length(char const*) /usr/include/c++/15.2.1/bits/char_traits.h:393 #2 0x562c75fb4222 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&) /usr/include/c++/15.2.1/bits/b asic_string.h:713 #3 0x562c761b81ae in operator() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1300 #4 0x562c761d7d53 in operator()<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false> > /usr/include/c++/15.2.1/bits/predefined_ops.h:318 #5 0x562c761d789c in __find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, __gnu_cxx::__ops::_Iter_pred<ScrubBackend::match_in_shards(const hobject_t&, auth_selection_ t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > > /usr/include/c++/15.2.1/bits/stl_algobase.h:2095 #6 0x562c761d72b2 in find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3921 #7 0x562c761d5f6f in none_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:431 #8 0x562c761d4a50 in any_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, s td::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:450 #9 0x562c761bb84b in ScrubBackend::match_in_shards(hobject_t const&, auth_selection_t&, inconsistent_obj_wrapper&, std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) /home/k efu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1297 #10 0x562c761b4282 in ScrubBackend::compare_obj_in_maps[abi:cxx11](hobject_t const&) /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:941 #11 0x562c761d44af in operator()<hobject_t> /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:887 #12 0x562c761d4836 in for_each<std::_Rb_tree_const_iterator<hobject_t>, ScrubBackend::compare_smaps()::<lambda(const auto:422&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3798 #13 0x562c761b3259 in ScrubBackend::compare_smaps() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:884 #14 0x562c761a478d in ScrubBackend::update_authoritative() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:315` ``` Fix by using bufferlist::length() which tells if the given buffer is empty instead of converting the buffer content to a string. Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Aequitosh
added a commit
to Aequitosh/ceph
that referenced
this pull request
Sep 12, 2025
Currently, *all* MGRs collectively segfault on Ceph v19.2.3 running on Debian Trixie if a client requests the removal of an RBD image from the RBD trash (ceph#6635 [0]). After a lot of investigation, the cause of this still isn't clear to me; the most likely culprit are some internal changes to Python sub-interpreters that happened between Python versions 3.12 and 3.13. What leads me to this conclusion is the following: 1. A user on our forum noted [1] that the issue disappeared as soon as they set up a Ceph MGR inside a Debian Bookworm VM. Bookworm has Python version 3.11, which is the version before any substantial changes to sub-interpreters [2][3] were made. 2. There is an upstream issue [4] regarding another segfault during MGR startup. The author concluded that this problem is related to sub-interpreters and opened another issue [5] on Python's issue tracker that goes into more detail. Even though this is for a completely different code path, it shows that issues related to sub-interpreters are popping up elsewhere at the very least. 3. The segfault happens *inside* the Python interpreter: #0 0x000078e04d89e95c __pthread_kill_implementation (libc.so.6 + 0x9495c) ceph#1 0x000078e04d849cc2 __GI_raise (libc.so.6 + 0x3fcc2) ceph#2 0x00005ab95de92658 reraise_fatal (/usr/bin/ceph-mgr + 0x32d658) ceph#3 0x000078e04d849df0 __restore_rt (libc.so.6 + 0x3fdf0) ceph#4 0x000078e04ef598b0 _Py_dict_lookup (libpython3.13.so.1.0 + 0x1598b0) ceph#5 0x000078e04efa1843 _PyDict_GetItemRef_KnownHash (libpython3.13.so.1.0 + 0x1a1843) ceph#6 0x000078e04efa1af5 _PyType_LookupRef (libpython3.13.so.1.0 + 0x1a1af5) ceph#7 0x000078e04efa216b _Py_type_getattro_impl (libpython3.13.so.1.0 + 0x1a216b) ceph#8 0x000078e04ef6f60d PyObject_GetAttr (libpython3.13.so.1.0 + 0x16f60d) ceph#9 0x000078e04f043f20 _PyEval_EvalFrameDefault (libpython3.13.so.1.0 + 0x243f20) ceph#10 0x000078e04ef109dd _PyObject_VectorcallTstate (libpython3.13.so.1.0 + 0x1109dd) ceph#11 0x000078e04f1d3442 _PyObject_VectorcallTstate (libpython3.13.so.1.0 + 0x3d3442) ceph#12 0x000078e03b74ffed __pyx_f_3rbd_progress_callback (rbd.cpython-313-x86_64-linux-gnu.so + 0xacfed) ceph#13 0x000078e03afcc8af _ZN6librbd19AsyncObjectThrottleINS_8ImageCtxEE13start_next_opEv (librbd.so.1 + 0x3cc8af) ceph#14 0x000078e03afccfed _ZN6librbd19AsyncObjectThrottleINS_8ImageCtxEE9start_opsEm (librbd.so.1 + 0x3ccfed) ceph#15 0x000078e03afafec6 _ZN6librbd9operation11TrimRequestINS_8ImageCtxEE19send_remove_objectsEv (librbd.so.1 + 0x3afec6) ceph#16 0x000078e03afb0560 _ZN6librbd9operation11TrimRequestINS_8ImageCtxEE19send_copyup_objectsEv (librbd.so.1 + 0x3b0560) ceph#17 0x000078e03afb2e16 _ZN6librbd9operation11TrimRequestINS_8ImageCtxEE15should_completeEi (librbd.so.1 + 0x3b2e16) ceph#18 0x000078e03afae379 _ZN6librbd12AsyncRequestINS_8ImageCtxEE8completeEi (librbd.so.1 + 0x3ae379) ceph#19 0x000078e03ada8c70 _ZN7Context8completeEi (librbd.so.1 + 0x1a8c70) ceph#20 0x000078e03afcdb1e _ZN7Context8completeEi (librbd.so.1 + 0x3cdb1e) ceph#21 0x000078e04d6e4716 _ZN8librados14CB_AioCompleteclEv (librados.so.2 + 0xd2716) ceph#22 0x000078e04d6e5705 _ZN5boost4asio6detail19scheduler_operation8completeEPvRKNS_6system10error_codeEm (librados.so.2 + 0xd3705) ceph#23 0x000078e04d6e5f8a _ZN5boost4asio19asio_handler_invokeINS0_6detail23strand_executor_service7invokerIKNS0_10io_context19basic_executor_typeISaIvELm0EEEvEEEEvRT_z (librados.so.2 + 0xd3f8a) ceph#24 0x000078e04d6fc598 _ZN5boost4asio6detail19scheduler_operation8completeEPvRKNS_6system10error_codeEm (librados.so.2 + 0xea598) ceph#25 0x000078e04d6e9a71 _ZN5boost4asio6detail9scheduler3runERNS_6system10error_codeE (librados.so.2 + 0xd7a71) ceph#26 0x000078e04d6fff63 _ZN5boost4asio10io_context3runEv (librados.so.2 + 0xedf63) ceph#27 0x000078e04dae1224 n/a (libstdc++.so.6 + 0xe1224) ceph#28 0x000078e04d89cb7b start_thread (libc.so.6 + 0x92b7b) ceph#29 0x000078e04d91a7b8 __clone3 (libc.so.6 + 0x1107b8) Note that in ceph#12, you can see that a "progress callback" is being called by librbd. This callback is a plain Python function that is passed down via Ceph's Python/C++ bindings for librbd [6]. (I'd provide more stack traces for the other threads here, but they're rather massive.) Then, from ceph#11 to ceph#4 the entire execution happens within the Python interpreter: This is just the callback being executed. The segfault happens at ceph#4 during _Py_dict_lookup(), which is a private function inside the Python interpreter to look something up in a `dict` [7]. As this function is so fundamental, it shouldn't ever fail, ever; but yet it does, which suggests that some internal interpreter state is most likely corrupted at that point. Since it's incredibly hard to debug and actually figure out what the *real* underlying issue is, simply disable that on_progress callback instead. I just hope that this doesn't move the problem somewhere else. Unless I'm mistaken, there aren't any other callbacks that get passed through C/C++ via Cython [8] like this, so this should hopefully prevent any further SIGSEGVs until this is fixed upstream (somehow). Note that this bug was also reported upstream [9]. [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=6635 [1]: https://forum.proxmox.com/threads/ceph-managers-seg-faulting-post-upgrade-8-9-upgrade.169363/post-796315 [2]: https://docs.python.org/3.12/whatsnew/3.12.html#pep-684-a-per-interpreter-gil [3]: python/cpython#117953 [4]: https://tracker.ceph.com/issues/67696 [5]: python/cpython#138045 [6]: https://github.com/ceph/ceph/blob/c92aebb279828e9c3c1f5d24613efca272649e62/src/pybind/rbd/rbd.pyx#L878-L907 [7]: https://github.com/python/cpython/blob/282bd0fe98bf1c3432fd5a079ecf65f165a52587/Objects/dictobject.c#L1262-L1278 [8]: https://cython.org/ [9]: https://tracker.ceph.com/issues/72713 Fixes: ceph#6635 Signed-off-by: Max R. Carrara <m.carrara@proxmox.com>
rzarzynski
pushed a commit
to rzarzynski/ceph
that referenced
this pull request
Sep 17, 2025
Replace unsafe string construction with bufferlist::length() to avoid reading beyond buffer boundaries. In commit 92ccbff, we introduced a bug when checking if a digest was empty by constructing a std::string from bufferlist: ``` std::string(digest.second.c_str()).empty() ``` This is unsafe because bufferlist data is not guaranteed to be null- terminated. The std::string constructor searches for a null terminator and may read beyond the bufferlist's allocated memory, causing a heap-buffer-overflow detected by AddressSanitizer: ``` ==66092==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e0c65215004 at pc 0x7fbc6e27c597 bp 0x7ffe29fb6100 sp 0x7ffe29fb58b8 READ of size 5 at 0x7e0c65215004 thread T0 #0 0x7fbc6e27c596 in strlen /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:425 #1 0x562c75fad91a in std::char_traits<char>::length(char const*) /usr/include/c++/15.2.1/bits/char_traits.h:393 #2 0x562c75fb4222 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&) /usr/include/c++/15.2.1/bits/b asic_string.h:713 ceph#3 0x562c761b81ae in operator() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1300 ceph#4 0x562c761d7d53 in operator()<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false> > /usr/include/c++/15.2.1/bits/predefined_ops.h:318 ceph#5 0x562c761d789c in __find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, __gnu_cxx::__ops::_Iter_pred<ScrubBackend::match_in_shards(const hobject_t&, auth_selection_ t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > > /usr/include/c++/15.2.1/bits/stl_algobase.h:2095 ceph#6 0x562c761d72b2 in find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3921 ceph#7 0x562c761d5f6f in none_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:431 ceph#8 0x562c761d4a50 in any_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, s td::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:450 ceph#9 0x562c761bb84b in ScrubBackend::match_in_shards(hobject_t const&, auth_selection_t&, inconsistent_obj_wrapper&, std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) /home/k efu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1297 ceph#10 0x562c761b4282 in ScrubBackend::compare_obj_in_maps[abi:cxx11](hobject_t const&) /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:941 ceph#11 0x562c761d44af in operator()<hobject_t> /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:887 ceph#12 0x562c761d4836 in for_each<std::_Rb_tree_const_iterator<hobject_t>, ScrubBackend::compare_smaps()::<lambda(const auto:422&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3798 ceph#13 0x562c761b3259 in ScrubBackend::compare_smaps() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:884 ceph#14 0x562c761a478d in ScrubBackend::update_authoritative() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:315` ``` Fix by using bufferlist::length() which tells if the given buffer is empty instead of converting the buffer content to a string. Signed-off-by: Kefu Chai <tchaikov@gmail.com> (cherry picked from commit 100c20b)
ThomasLamprecht
pushed a commit
to ThomasLamprecht/ceph
that referenced
this pull request
Sep 17, 2025
Currently, *all* MGRs collectively segfault on Ceph v19.2.3 running on Debian Trixie if a client requests the removal of an RBD image from the RBD trash (ceph#6635 [0]). After a lot of investigation, the cause of this still isn't clear to me; the most likely culprit are some internal changes to Python sub-interpreters that happened between Python versions 3.12 and 3.13. What leads me to this conclusion is the following: 1. A user on our forum noted [1] that the issue disappeared as soon as they set up a Ceph MGR inside a Debian Bookworm VM. Bookworm has Python version 3.11, which is the version before any substantial changes to sub-interpreters [2][3] were made. 2. There is an upstream issue [4] regarding another segfault during MGR startup. The author concluded that this problem is related to sub-interpreters and opened another issue [5] on Python's issue tracker that goes into more detail. Even though this is for a completely different code path, it shows that issues related to sub-interpreters are popping up elsewhere at the very least. 3. The segfault happens *inside* the Python interpreter: #0 0x000078e04d89e95c __pthread_kill_implementation (libc.so.6 + 0x9495c) ceph#1 0x000078e04d849cc2 __GI_raise (libc.so.6 + 0x3fcc2) ceph#2 0x00005ab95de92658 reraise_fatal (/usr/bin/ceph-mgr + 0x32d658) ceph#3 0x000078e04d849df0 __restore_rt (libc.so.6 + 0x3fdf0) ceph#4 0x000078e04ef598b0 _Py_dict_lookup (libpython3.13.so.1.0 + 0x1598b0) ceph#5 0x000078e04efa1843 _PyDict_GetItemRef_KnownHash (libpython3.13.so.1.0 + 0x1a1843) ceph#6 0x000078e04efa1af5 _PyType_LookupRef (libpython3.13.so.1.0 + 0x1a1af5) ceph#7 0x000078e04efa216b _Py_type_getattro_impl (libpython3.13.so.1.0 + 0x1a216b) ceph#8 0x000078e04ef6f60d PyObject_GetAttr (libpython3.13.so.1.0 + 0x16f60d) ceph#9 0x000078e04f043f20 _PyEval_EvalFrameDefault (libpython3.13.so.1.0 + 0x243f20) ceph#10 0x000078e04ef109dd _PyObject_VectorcallTstate (libpython3.13.so.1.0 + 0x1109dd) ceph#11 0x000078e04f1d3442 _PyObject_VectorcallTstate (libpython3.13.so.1.0 + 0x3d3442) ceph#12 0x000078e03b74ffed __pyx_f_3rbd_progress_callback (rbd.cpython-313-x86_64-linux-gnu.so + 0xacfed) ceph#13 0x000078e03afcc8af _ZN6librbd19AsyncObjectThrottleINS_8ImageCtxEE13start_next_opEv (librbd.so.1 + 0x3cc8af) ceph#14 0x000078e03afccfed _ZN6librbd19AsyncObjectThrottleINS_8ImageCtxEE9start_opsEm (librbd.so.1 + 0x3ccfed) ceph#15 0x000078e03afafec6 _ZN6librbd9operation11TrimRequestINS_8ImageCtxEE19send_remove_objectsEv (librbd.so.1 + 0x3afec6) ceph#16 0x000078e03afb0560 _ZN6librbd9operation11TrimRequestINS_8ImageCtxEE19send_copyup_objectsEv (librbd.so.1 + 0x3b0560) ceph#17 0x000078e03afb2e16 _ZN6librbd9operation11TrimRequestINS_8ImageCtxEE15should_completeEi (librbd.so.1 + 0x3b2e16) ceph#18 0x000078e03afae379 _ZN6librbd12AsyncRequestINS_8ImageCtxEE8completeEi (librbd.so.1 + 0x3ae379) ceph#19 0x000078e03ada8c70 _ZN7Context8completeEi (librbd.so.1 + 0x1a8c70) ceph#20 0x000078e03afcdb1e _ZN7Context8completeEi (librbd.so.1 + 0x3cdb1e) ceph#21 0x000078e04d6e4716 _ZN8librados14CB_AioCompleteclEv (librados.so.2 + 0xd2716) ceph#22 0x000078e04d6e5705 _ZN5boost4asio6detail19scheduler_operation8completeEPvRKNS_6system10error_codeEm (librados.so.2 + 0xd3705) ceph#23 0x000078e04d6e5f8a _ZN5boost4asio19asio_handler_invokeINS0_6detail23strand_executor_service7invokerIKNS0_10io_context19basic_executor_typeISaIvELm0EEEvEEEEvRT_z (librados.so.2 + 0xd3f8a) ceph#24 0x000078e04d6fc598 _ZN5boost4asio6detail19scheduler_operation8completeEPvRKNS_6system10error_codeEm (librados.so.2 + 0xea598) ceph#25 0x000078e04d6e9a71 _ZN5boost4asio6detail9scheduler3runERNS_6system10error_codeE (librados.so.2 + 0xd7a71) ceph#26 0x000078e04d6fff63 _ZN5boost4asio10io_context3runEv (librados.so.2 + 0xedf63) ceph#27 0x000078e04dae1224 n/a (libstdc++.so.6 + 0xe1224) ceph#28 0x000078e04d89cb7b start_thread (libc.so.6 + 0x92b7b) ceph#29 0x000078e04d91a7b8 __clone3 (libc.so.6 + 0x1107b8) Note that in ceph#12, you can see that a "progress callback" is being called by librbd. This callback is a plain Python function that is passed down via Ceph's Python/C++ bindings for librbd [6]. (I'd provide more stack traces for the other threads here, but they're rather massive.) Then, from ceph#11 to ceph#4 the entire execution happens within the Python interpreter: This is just the callback being executed. The segfault happens at ceph#4 during _Py_dict_lookup(), which is a private function inside the Python interpreter to look something up in a `dict` [7]. As this function is so fundamental, it shouldn't ever fail, ever; but yet it does, which suggests that some internal interpreter state is most likely corrupted at that point. Since it's incredibly hard to debug and actually figure out what the *real* underlying issue is, simply disable that on_progress callback instead. I just hope that this doesn't move the problem somewhere else. Unless I'm mistaken, there aren't any other callbacks that get passed through C/C++ via Cython [8] like this, so this should hopefully prevent any further SIGSEGVs until this is fixed upstream (somehow). Note that this bug was also reported upstream [9]. [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=6635 [1]: https://forum.proxmox.com/threads/ceph-managers-seg-faulting-post-upgrade-8-9-upgrade.169363/post-796315 [2]: https://docs.python.org/3.12/whatsnew/3.12.html#pep-684-a-per-interpreter-gil [3]: python/cpython#117953 [4]: https://tracker.ceph.com/issues/67696 [5]: python/cpython#138045 [6]: https://github.com/ceph/ceph/blob/c92aebb279828e9c3c1f5d24613efca272649e62/src/pybind/rbd/rbd.pyx#L878-L907 [7]: https://github.com/python/cpython/blob/282bd0fe98bf1c3432fd5a079ecf65f165a52587/Objects/dictobject.c#L1262-L1278 [8]: https://cython.org/ [9]: https://tracker.ceph.com/issues/72713 Fixes: ceph#6635 Signed-off-by: Max R. Carrara <m.carrara@proxmox.com> Link: https://lore.proxmox.com/20250910085244.123467-1-m.carrara@proxmox.com
tchaikov
added a commit
that referenced
this pull request
Oct 21, 2025
…ives
Add suppression rules for two categories of false positive warnings
encountered during ASan-enabled testing:
1. PyModule_ExecDef memory leaks: ASan incorrectly interprets Python's
module loading behavior as memory leaks when the interpreter loads
extension modules.
2. __cxa_throw interception failures: ASan's interceptor cannot properly
intercept exception handling when libstdc++.so is loaded after the
ASan shared library, causing CHECK failures.
3. ErasureCodePluginRegistry::load:
`ceph::ErasureCodePluginRegistry::load()` is known to leak, as we
don't free the memory allocated by the ec plugins which are
registered in the `ErasureCodePluginRegistry` singleton. this is a
known issue, but since the `ErasureCodePluginRegistry` instance is a
singleton. we can live with it. in this change, we add the rule to
suppress the leak report from LeakSanitizer. this rule also exist in
qa/valgrind.supp.
All warnings are confirmed false positives that should be suppressed
to reduce noise in test output.
Example warnings:
```
Direct leak of 3264 byte(s) in 1 object(s) allocated from:
#0 0x7f6027d20cb5 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x7f60277557ad (/usr/lib/libpython3.13.so.1.0+0x1557ad) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#2 0x7f6027756067 (/usr/lib/libpython3.13.so.1.0+0x156067) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#3 0x7f60278471a0 (/usr/lib/libpython3.13.so.1.0+0x2471a0) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#4 0x7f602774d031 (/usr/lib/libpython3.13.so.1.0+0x14d031) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#5 0x7b60234093bb in __Pyx_modinit_type_init_code.constprop.0 /home/kefu/dev/ceph/build/src/pybind/rados/rados.c:82066
#6 0x7b602340a826 in __pyx_pymod_exec_rados /home/kefu/dev/ceph/build/src/pybind/rados/rados.c:82755
#7 0x7f6027856777 in PyModule_ExecDef (/usr/lib/libpython3.13.so.1.0+0x256777) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#8 0x7f602785baa3 (/usr/lib/libpython3.13.so.1.0+0x25baa3) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#9 0x7f6027793df2 (/usr/lib/libpython3.13.so.1.0+0x193df2) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#10 0x7f6027777cbe in _PyEval_EvalFrameDefault (/usr/lib/libpython3.13.so.1.0+0x177cbe) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#11 0x7f60277957de (/usr/lib/libpython3.13.so.1.0+0x1957de) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#12 0x7f60277d11b9 in PyObject_CallMethodObjArgs (/usr/lib/libpython3.13.so.1.0+0x1d11b9) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#13 0x7f60277d0ee4 in PyImport_ImportModuleLevelObject (/usr/lib/libpython3.13.so.1.0+0x1d0ee4) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#14 0x7f6027779c0c in _PyEval_EvalFrameDefault (/usr/lib/libpython3.13.so.1.0+0x179c0c) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#15 0x7f602784e2c8 in PyEval_EvalCode (/usr/lib/libpython3.13.so.1.0+0x24e2c8) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#16 0x7f602788c88b (/usr/lib/libpython3.13.so.1.0+0x28c88b) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#17 0x7f602788985c (/usr/lib/libpython3.13.so.1.0+0x28985c) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#18 0x7f6027886f57 (/usr/lib/libpython3.13.so.1.0+0x286f57) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#19 0x7f6027886211 (/usr/lib/libpython3.13.so.1.0+0x286211) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#20 0x7f6027885b82 (/usr/lib/libpython3.13.so.1.0+0x285b82) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#21 0x7f6027883e50 in Py_RunMain (/usr/lib/libpython3.13.so.1.0+0x283e50) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#22 0x7f602783bbea in Py_BytesMain (/usr/lib/libpython3.13.so.1.0+0x23bbea) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#23 0x7f6027227674 (/usr/lib/libc.so.6+0x27674) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
#24 0x7f6027227728 in __libc_start_main (/usr/lib/libc.so.6+0x27728) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
#25 0x55dae17e6044 in _start (/usr/bin/python3.13+0x1044) (BuildId: 8c0dc848f5b978c56ebeb07255bb332b4b37ae4e)
```
```
AddressSanitizer: CHECK failed: asan_interceptors.cpp:335 "((__interception::real___cxa_throw)) != (0)" (0x0, 0x0) (tid=3246455)
#0 0x7f345ea81979 in CheckUnwind ../../../../src/libsanitizer/asan/asan_rtl.cpp:69
#1 0x7f345eaa790d in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86
#2 0x7f345e9e1d54 in __interceptor___cxa_throw ../../../../src/libsanitizer/asan/asan_interceptors.cpp:335
#3 0x7f345e9e1d54 in __interceptor___cxa_throw ../../../../src/libsanitizer/asan/asan_interceptors.cpp:334
#4 0x7f3458623def in void boost::throw_exception<boost::bad_lexical_cast>(boost::bad_lexical_cast const&) /opt/ceph/include/boost/throw_exception.hpp:165
#5 0x7f345997ad3b in void boost::conversion::detail::throw_bad_cast<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long>() /opt/ceph/include/boost/lexical_cast/bad_lexical_cast.hpp:93
#6 0x7f3459979d35 in unsigned long boost::lexical_cast<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /opt/ceph/include/boost/lexical_cast.hpp:43`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
abitdrag
pushed a commit
to abitdrag/ceph
that referenced
this pull request
Oct 21, 2025
Removing vxattr 'ceph.dir.subvolume' on a directory without it being set causes the mds to crash. This is because the snaprealm would be null for the directory and the null check is missing. Setting the vxattr, creates the snaprealm for the directory as part of it. Hence, mds doesn't crash when the vxattr is set and then removed. This patch fixes the same. Reproducer: $mkdir /mnt/dir1 $setfattr -x "ceph.dir.subvolume" /mnt/dir1 Traceback: ------- Core was generated by `./ceph/build/bin/ceph-mds -i a -c ./ceph/build/ceph.conf'. Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007f33f1aa8034 in __pthread_kill_implementation () from /lib64/libc.so.6 ceph#1 0x00007f33f1a4ef1e in raise () from /lib64/libc.so.6 ceph#2 0x0000562b148a6fd0 in reraise_fatal (signum=signum@entry=11) at /ceph/src/global/signal_handler.cc:88 ceph#3 0x0000562b148a83d9 in handle_oneshot_fatal_signal (signum=11) at /ceph/src/global/signal_handler.cc:367 ceph#4 <signal handler called> ceph#5 Server::handle_client_setvxattr (this=0x562b4ee3f800, mdr=..., cur=0x562b4ef9cc00) at /ceph/src/mds/Server.cc:6406 ceph#6 0x0000562b145fadc2 in Server::handle_client_removexattr (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:7022 ceph#7 0x0000562b145fbff0 in Server::dispatch_client_request (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:2825 ceph#8 0x0000562b145fcfa2 in Server::handle_client_request (this=0x562b4ee3f800, req=...) at /ceph/src/mds/Server.cc:2676 ceph#9 0x0000562b1460063c in Server::dispatch (this=0x562b4ee3f800, m=...) at /ceph/src/mds/Server.cc:382 ceph#10 0x0000562b1450eb22 in MDSRank::handle_message (this=this@entry=0x562b4ef42008, m=...) at /ceph/src/mds/MDSRank.cc:1222 ceph#11 0x0000562b14510c93 in MDSRank::_dispatch (this=this@entry=0x562b4ef42008, m=..., new_msg=new_msg@entry=true) at /ceph/src/mds/MDSRank.cc:1045 ceph#12 0x0000562b14511620 in MDSRankDispatcher::ms_dispatch (this=this@entry=0x562b4ef42000, m=...) at /ceph/src/mds/MDSRank.cc:1019 ceph#13 0x0000562b144ff117 in MDSDaemon::ms_dispatch2 (this=0x562b4ee64000, m=...) at /ceph/src/common/RefCountedObj.h:56 ceph#14 0x00007f33f2f4974a in Messenger::ms_deliver_dispatch (this=0x562b4ee70000, m=...) at /ceph/src/msg/Messenger.h:746 ceph#15 0x00007f33f2f467e2 in DispatchQueue::entry (this=0x562b4ee703b8) at /ceph/src/msg/DispatchQueue.cc:202 ceph#16 0x00007f33f2ff61cb in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /ceph/src/msg/DispatchQueue.h:101 ceph#17 0x00007f33f2df4b5d in Thread::entry_wrapper (this=0x562b4ee70518) at /ceph/src/common/Thread.cc:87 ceph#18 0x00007f33f2df4b6f in Thread::_entry_func (arg=<optimized out>) at /ceph/src/common/Thread.cc:74 ceph#19 0x00007f33f1aa6088 in start_thread () from /lib64/libc.so.6 ceph#20 0x00007f33f1b29f8c in clone3 () from /lib64/libc.so.6 --------- Fixes: https://tracker.ceph.com/issues/70794 Signed-off-by: Kotresh HR <khiremat@redhat.com>
abitdrag
pushed a commit
to abitdrag/ceph
that referenced
this pull request
Oct 21, 2025
Fix a memory leak where HybridAllocator's embedded BitmapAllocator
(bmap_alloc) was not properly cleaned up during destruction. The issue
occurred because virtual function calls in destructors don't dispatch
to derived class implementations - when AvlAllocator::~AvlAllocator()
calls shutdown(), it invokes AvlAllocator::shutdown() rather than
HybridAllocatorBase::shutdown(), leaving bmap_alloc unfreed.
This issue only affects unit tests and does not impact production,
as BlueFS always calls shutdown() explicitly when stopping allocators
in BlueFS::_stop_alloc()
This change has minimal impact on existing code that already calls
shutdown() explicitly, since shutdown() has idempotent behavior and
can be safely called multiple times. Additionally, destructors are
only invoked during system teardown, so the potential double shutdown()
call does not affect runtime performance.
Changes:
- Replace raw pointer bmap_alloc with unique_ptr for automatic cleanup
- Add explicit shutdown() call in BitmapAllocator destructor for RAII
- Replace manual bmap_alloc->shutdown() with bmap_alloc.reset()
- Add explicit default destructor to HybridAllocatorBase for clarity
This eliminates the need for manual shutdown() calls and ensures
proper resource cleanup through RAII principles.
Fixes ASan-reported indirect leak of BitmapAllocator resources.
Please see the leak reported by ASan for more details:
```
Indirect leak of 23 byte(s) in 1 object(s) allocated from:
#0 0x7fe30b121a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
ceph#1 0x5614e057ee11 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
ceph#2 0x5614e057d279 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
ceph#3 0x5614e057d279 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
ceph#4 0x5614e057d279 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
ceph#5 0x5614e057cf27 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
ceph#6 0x5614e05fe91b in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<true>(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:291
ceph#7 0x5614e05f3cd4 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/15.1.
1/bits/basic_string.h:617
ceph#8 0x5614e069b892 in ceph::mutex_debug_detail::mutex_debug_impl<false>::mutex_debug_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) /home/kefu/dev/ceph/src/common/mutex_debug.
h:103
ceph#9 0x5614e06b8586 in ceph::mutex_debug_detail::mutex_debug_impl<false> ceph::make_mutex<char const (&) [23]>(char const (&) [23]) /home/kefu/dev/ceph/src/common/ceph_mutex.h:118
ceph#10 0x5614e06b8350 in AllocatorLevel02<AllocatorLevel01Loose>::AllocatorLevel02() /home/kefu/dev/ceph/src/os/bluestore/fastbmap_allocator_impl.h:526
ceph#11 0x5614e06b393b in BitmapAllocator::BitmapAllocator(ceph::common::CephContext*, long, long, std::basic_string_view<char, std::char_traits<char> >) /home/kefu/dev/ceph/src/os/bluestore/BitmapAllocator.cc:16
ceph#12 0x5614e072755f in HybridAllocatorBase<AvlAllocator>::_spillover_range(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator_impl.h:123
ceph#13 0x5614e06c9045 in AvlAllocator::_try_insert_range(unsigned long, unsigned long, boost::intrusive::tree_iterator<boost::intrusive::mhtraits<range_seg_t, boost::intrusive::avl_set_member_hook<>, &range_seg_t::offset_hook>,
false>*) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.h:210
ceph#14 0x5614e06c0104 in AvlAllocator::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:136
ceph#15 0x5614e0586e5f in HybridAllocatorBase<AvlAllocator>::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator.h:110
ceph#16 0x5614e06c6559 in AvlAllocator::init_add_free(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:494
ceph#17 0x5614e0582943 in HybridAllocator_fragmentation_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/hybrid_allocator_test.cc:285
ceph#18 0x5614e061bee3 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#19 0x5614e0606562 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
connorfawcett
pushed a commit
to connorfawcett/ceph
that referenced
this pull request
Oct 23, 2025
…ives
Add suppression rules for two categories of false positive warnings
encountered during ASan-enabled testing:
1. PyModule_ExecDef memory leaks: ASan incorrectly interprets Python's
module loading behavior as memory leaks when the interpreter loads
extension modules.
2. __cxa_throw interception failures: ASan's interceptor cannot properly
intercept exception handling when libstdc++.so is loaded after the
ASan shared library, causing CHECK failures.
3. ErasureCodePluginRegistry::load:
`ceph::ErasureCodePluginRegistry::load()` is known to leak, as we
don't free the memory allocated by the ec plugins which are
registered in the `ErasureCodePluginRegistry` singleton. this is a
known issue, but since the `ErasureCodePluginRegistry` instance is a
singleton. we can live with it. in this change, we add the rule to
suppress the leak report from LeakSanitizer. this rule also exist in
qa/valgrind.supp.
All warnings are confirmed false positives that should be suppressed
to reduce noise in test output.
Example warnings:
```
Direct leak of 3264 byte(s) in 1 object(s) allocated from:
#0 0x7f6027d20cb5 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
ceph#1 0x7f60277557ad (/usr/lib/libpython3.13.so.1.0+0x1557ad) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#2 0x7f6027756067 (/usr/lib/libpython3.13.so.1.0+0x156067) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#3 0x7f60278471a0 (/usr/lib/libpython3.13.so.1.0+0x2471a0) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#4 0x7f602774d031 (/usr/lib/libpython3.13.so.1.0+0x14d031) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#5 0x7b60234093bb in __Pyx_modinit_type_init_code.constprop.0 /home/kefu/dev/ceph/build/src/pybind/rados/rados.c:82066
ceph#6 0x7b602340a826 in __pyx_pymod_exec_rados /home/kefu/dev/ceph/build/src/pybind/rados/rados.c:82755
ceph#7 0x7f6027856777 in PyModule_ExecDef (/usr/lib/libpython3.13.so.1.0+0x256777) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#8 0x7f602785baa3 (/usr/lib/libpython3.13.so.1.0+0x25baa3) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#9 0x7f6027793df2 (/usr/lib/libpython3.13.so.1.0+0x193df2) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#10 0x7f6027777cbe in _PyEval_EvalFrameDefault (/usr/lib/libpython3.13.so.1.0+0x177cbe) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#11 0x7f60277957de (/usr/lib/libpython3.13.so.1.0+0x1957de) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#12 0x7f60277d11b9 in PyObject_CallMethodObjArgs (/usr/lib/libpython3.13.so.1.0+0x1d11b9) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#13 0x7f60277d0ee4 in PyImport_ImportModuleLevelObject (/usr/lib/libpython3.13.so.1.0+0x1d0ee4) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#14 0x7f6027779c0c in _PyEval_EvalFrameDefault (/usr/lib/libpython3.13.so.1.0+0x179c0c) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#15 0x7f602784e2c8 in PyEval_EvalCode (/usr/lib/libpython3.13.so.1.0+0x24e2c8) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#16 0x7f602788c88b (/usr/lib/libpython3.13.so.1.0+0x28c88b) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#17 0x7f602788985c (/usr/lib/libpython3.13.so.1.0+0x28985c) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#18 0x7f6027886f57 (/usr/lib/libpython3.13.so.1.0+0x286f57) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#19 0x7f6027886211 (/usr/lib/libpython3.13.so.1.0+0x286211) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#20 0x7f6027885b82 (/usr/lib/libpython3.13.so.1.0+0x285b82) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#21 0x7f6027883e50 in Py_RunMain (/usr/lib/libpython3.13.so.1.0+0x283e50) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#22 0x7f602783bbea in Py_BytesMain (/usr/lib/libpython3.13.so.1.0+0x23bbea) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#23 0x7f6027227674 (/usr/lib/libc.so.6+0x27674) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
ceph#24 0x7f6027227728 in __libc_start_main (/usr/lib/libc.so.6+0x27728) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
ceph#25 0x55dae17e6044 in _start (/usr/bin/python3.13+0x1044) (BuildId: 8c0dc848f5b978c56ebeb07255bb332b4b37ae4e)
```
```
AddressSanitizer: CHECK failed: asan_interceptors.cpp:335 "((__interception::real___cxa_throw)) != (0)" (0x0, 0x0) (tid=3246455)
#0 0x7f345ea81979 in CheckUnwind ../../../../src/libsanitizer/asan/asan_rtl.cpp:69
ceph#1 0x7f345eaa790d in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86
ceph#2 0x7f345e9e1d54 in __interceptor___cxa_throw ../../../../src/libsanitizer/asan/asan_interceptors.cpp:335
ceph#3 0x7f345e9e1d54 in __interceptor___cxa_throw ../../../../src/libsanitizer/asan/asan_interceptors.cpp:334
ceph#4 0x7f3458623def in void boost::throw_exception<boost::bad_lexical_cast>(boost::bad_lexical_cast const&) /opt/ceph/include/boost/throw_exception.hpp:165
ceph#5 0x7f345997ad3b in void boost::conversion::detail::throw_bad_cast<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long>() /opt/ceph/include/boost/lexical_cast/bad_lexical_cast.hpp:93
ceph#6 0x7f3459979d35 in unsigned long boost::lexical_cast<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /opt/ceph/include/boost/lexical_cast.hpp:43`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
harsimran-05
pushed a commit
to harsimran-05/ceph
that referenced
this pull request
Oct 27, 2025
Replace unsafe string construction with bufferlist::length() to avoid reading beyond buffer boundaries. In commit 92ccbff, we introduced a bug when checking if a digest was empty by constructing a std::string from bufferlist: ``` std::string(digest.second.c_str()).empty() ``` This is unsafe because bufferlist data is not guaranteed to be null- terminated. The std::string constructor searches for a null terminator and may read beyond the bufferlist's allocated memory, causing a heap-buffer-overflow detected by AddressSanitizer: ``` ==66092==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e0c65215004 at pc 0x7fbc6e27c597 bp 0x7ffe29fb6100 sp 0x7ffe29fb58b8 READ of size 5 at 0x7e0c65215004 thread T0 #0 0x7fbc6e27c596 in strlen /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:425 #1 0x562c75fad91a in std::char_traits<char>::length(char const*) /usr/include/c++/15.2.1/bits/char_traits.h:393 #2 0x562c75fb4222 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&) /usr/include/c++/15.2.1/bits/b asic_string.h:713 #3 0x562c761b81ae in operator() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1300 ceph#4 0x562c761d7d53 in operator()<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false> > /usr/include/c++/15.2.1/bits/predefined_ops.h:318 ceph#5 0x562c761d789c in __find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, __gnu_cxx::__ops::_Iter_pred<ScrubBackend::match_in_shards(const hobject_t&, auth_selection_ t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > > /usr/include/c++/15.2.1/bits/stl_algobase.h:2095 ceph#6 0x562c761d72b2 in find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3921 ceph#7 0x562c761d5f6f in none_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:431 ceph#8 0x562c761d4a50 in any_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, s td::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:450 ceph#9 0x562c761bb84b in ScrubBackend::match_in_shards(hobject_t const&, auth_selection_t&, inconsistent_obj_wrapper&, std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) /home/k efu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1297 ceph#10 0x562c761b4282 in ScrubBackend::compare_obj_in_maps[abi:cxx11](hobject_t const&) /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:941 ceph#11 0x562c761d44af in operator()<hobject_t> /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:887 ceph#12 0x562c761d4836 in for_each<std::_Rb_tree_const_iterator<hobject_t>, ScrubBackend::compare_smaps()::<lambda(const auto:422&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3798 ceph#13 0x562c761b3259 in ScrubBackend::compare_smaps() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:884 ceph#14 0x562c761a478d in ScrubBackend::update_authoritative() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:315` ``` Fix by using bufferlist::length() which tells if the given buffer is empty instead of converting the buffer content to a string. Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Signed-off-by: Alexey Lapitsky lex@realisticgroup.com