[release-7.4] Revert condition for releasing memory from tlog's versionLocation data structure #12298

spraza · 2025-08-13T22:56:29Z

Description

We introduced this change in #11815. The idea was that for tlog spill by reference, we should be using popped version, not persistent version, for the purposes of releasing memory from the logData->versionLocation data structure. This causes OOMs in remote tlogs under certain high traffic workloads after a multi-hour primary-remote disconnect i.e. after the disconnect is healed, remote tlogs' RSS footprint grows in an unbounded way, presumably because some storage server wasn't able to pop fast enough, causing memory build up in tlog.

This got us thinking whether we really need this change in the first place and whether it's safe to revert. After discussion with a few people, the understanding is that this should not cause a correctness issue since tlog restart can rebuild the state from disk. Joshua/simulation also have not shown this issue, we made the fix pre-emptively. Worst case potential cost is disk build up and that's because the versionLocation in-memory state is used for disk queue popping, and lack of the versions in memory could mean we don't release disk space fast enough. Having said that, we think even if it were to happen, this is transient and disk space would eventually be released because of how disk queue popping works.

Overall, we think this change is safe to revert.

Testing

Tested in a cluster that OOMs no longer happen.
100K Joshua: 20250813-225250-praza-743-tag-v216-01060c4e-43f3c5618750bb80 compressed=True data_size=40160171 duration=5570120 ended=100000 fail=1 fail_fast=10 max_runs=100000 pass=99999 priority=100 remaining=0 runtime=1:19:41 sanity=False started=100000 stopped=20250814-001231 submitted=20250813-225250 timeout=5400 username=praza-743-tag-v216-01060c4ee1fb5386e045bb9c0b086f07315ac3ef

For the 1 Joshua failure, it's noSim which does not setup a simulated cluster, and I can not reproduce:

$ ~/cnd_build_output/bin/fdbserver -r test -f /root/src/foundationdb/tests/noSim/KeyValueStoreRocksDBTest.toml --buggify on --seed 2701483725 --logsize 1GiB
startingConfiguration: start
useDB: false
Run test:UnitTests start
setting up test (UnitTests)...
Test received trigger for setup...
running test (UnitTests)...
Found 5 tests
Testing noSim/fdbserver/KeyValueStoreRocksDB/RocksDBReopen
Testing noSim/fdbserver/KeyValueStoreRocksDB/IngestSSTFileVisibility
Testing noSim/fdbserver/KeyValueStoreRocksDB/CheckpointRestoreKeyValues
Testing noSim/fdbserver/KeyValueStoreRocksDB/RocksDBBasic
Testing noSim/fdbserver/KeyValueStoreRocksDB/CheckpointRestoreColumnFamily
UnitTests complete
UnitTests complete
checking test (UnitTests)...
fetching metrics (UnitTests)...
Metric (0, 0): UnitTests.Test Cases Available, 5.000000, 5
Metric (0, 1): UnitTests.Test Cases Executed, 5.000000, 5
Metric (0, 2): UnitTests.Test Cases Failed, 0.000000, 0
Metric (0, 3): UnitTests.Total wall clock time (s), 0.345422, 0.345
Metric (0, 4): UnitTests.Total flow time (s), 0.345016, 0.345
Test complete
1 test clients passed; 0 test clients failed
Run test:UnitTests Done.

1 tests passed; 0 tests failed.

I also looked at the stack trace mentioned in Joshua logs, does not seem related to this change:

$ llvm-addr2line -e ~/cnd_build_output/bin/fdbserver -p -C -f -i 0x7f9a6ac3bbf0 0x622d241 0x48a1b9b 0x622cfb9 0x622cdcd 0x622cd29 0x7f9a6ac8719a
?? at ??:0
operator() at /root/src/foundationdb/flow/IThreadPool.cpp:85
 (inlined by) complete<ThreadPool::ActionWrapper> at /opt/boost_1_86_0_clang/include/boost/asio/detail/handler_work.hpp:470
 (inlined by) boost::asio::detail::completion_handler<ThreadPool::ActionWrapper, boost::asio::io_context::basic_executor_type<std::__1::allocator<void>, 0ul>>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) at /opt/boost_1_86_0_clang/include/boost/asio/detail/completion_handler.hpp:74
rethrow_pending_exception at /opt/boost_1_86_0_clang/include/boost/asio/detail/thread_info_base.hpp:231
 (inlined by) boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) at /opt/boost_1_86_0_clang/include/boost/asio/detail/impl/scheduler.ipp:494
boost::asio::detail::scheduler::run_one(boost::system::error_code&) at /opt/boost_1_86_0_clang/include/boost/asio/detail/impl/scheduler.ipp:0
run_one at /opt/boost_1_86_0_clang/include/boost/asio/impl/io_context.ipp:0
 (inlined by) ThreadPool::Thread::run() at /root/src/foundationdb/flow/IThreadPool.cpp:54
ThreadPool::start(void*) at /root/src/foundationdb/flow/IThreadPool.cpp:66
?? at ??:0

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

The PR has a description, explaining both the problem and the solution.
The description mentions which forms of testing were done and the testing seems reasonable.
Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

…ogic that (apple#11815)" This reverts commit 3c43139.

foundationdb-ci · 2025-08-13T23:35:36Z

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

Commit ID: 01060c4
Duration 0:38:54
Result: ✅ SUCCEEDED
Error: N/A
Build Log terminal output (available for 30 days)
Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci · 2025-08-13T23:48:55Z

Result of foundationdb-pr-clang-arm on Linux CentOS 7

Commit ID: 01060c4
Duration 0:52:17
Result: ✅ SUCCEEDED
Error: N/A
Build Log terminal output (available for 30 days)
Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci · 2025-08-13T23:59:02Z

Result of foundationdb-pr-macos on macOS Ventura 13.x

Commit ID: 01060c4
Duration 1:02:23
Result: ✅ SUCCEEDED
Error: N/A
Build Log terminal output (available for 30 days)
Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci · 2025-08-14T00:07:46Z

Result of foundationdb-pr-clang on Linux RHEL 9

Commit ID: 01060c4
Duration 1:11:08
Result: ✅ SUCCEEDED
Error: N/A
Build Log terminal output (available for 30 days)
Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci · 2025-08-14T00:10:34Z

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

Commit ID: 01060c4
Duration 1:13:51
Result: ✅ SUCCEEDED
Error: N/A
Build Log terminal output (available for 30 days)
Build Workspace zip file of the working directory (available for 30 days)
Cluster Test Logs zip file of the test logs (available for 30 days)

foundationdb-ci · 2025-08-14T00:12:40Z

Result of foundationdb-pr on Linux RHEL 9

Commit ID: 01060c4
Duration 1:16:00
Result: ✅ SUCCEEDED
Error: N/A
Build Log terminal output (available for 30 days)
Build Workspace zip file of the working directory (available for 30 days)

Revert "- In case of spill by reference, log servers should use the l…

01060c4

…ogic that (apple#11815)" This reverts commit 3c43139.

spraza marked this pull request as draft August 13, 2025 22:56

spraza changed the title ~~Revert condition for releasing memory from tlog's versionLocation data structure~~ [release-7.4] Revert condition for releasing memory from tlog's versionLocation data structure Aug 13, 2025

spraza mentioned this pull request Aug 13, 2025

Forward port #12298: Revert condition for releasing memory from tlog's versionLocation data structure #12299

Merged

5 tasks

jzhou77 approved these changes Aug 14, 2025

View reviewed changes

spraza requested a review from sbodagala August 14, 2025 05:15

spraza marked this pull request as ready for review August 14, 2025 05:15

sbodagala approved these changes Aug 14, 2025

View reviewed changes

spraza merged commit 55cb585 into apple:release-7.4 Aug 14, 2025
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[release-7.4] Revert condition for releasing memory from tlog's versionLocation data structure #12298

[release-7.4] Revert condition for releasing memory from tlog's versionLocation data structure #12298

spraza commented Aug 13, 2025 •

edited

Loading

Uh oh!

foundationdb-ci commented Aug 13, 2025

Uh oh!

foundationdb-ci commented Aug 13, 2025

Uh oh!

foundationdb-ci commented Aug 13, 2025

Uh oh!

foundationdb-ci commented Aug 14, 2025

Uh oh!

foundationdb-ci commented Aug 14, 2025

Uh oh!

foundationdb-ci commented Aug 14, 2025

Uh oh!

Uh oh!

Uh oh!

[release-7.4] Revert condition for releasing memory from tlog's versionLocation data structure #12298

[release-7.4] Revert condition for releasing memory from tlog's versionLocation data structure #12298

Conversation

spraza commented Aug 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Testing

Code-Reviewer Section

For Release-Branches

Uh oh!

foundationdb-ci commented Aug 13, 2025

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

Uh oh!

foundationdb-ci commented Aug 13, 2025

Result of foundationdb-pr-clang-arm on Linux CentOS 7

Uh oh!

foundationdb-ci commented Aug 13, 2025

Result of foundationdb-pr-macos on macOS Ventura 13.x

Uh oh!

foundationdb-ci commented Aug 14, 2025

Result of foundationdb-pr-clang on Linux RHEL 9

Uh oh!

foundationdb-ci commented Aug 14, 2025

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

Uh oh!

foundationdb-ci commented Aug 14, 2025

Result of foundationdb-pr on Linux RHEL 9

Uh oh!

Uh oh!

Uh oh!

spraza commented Aug 13, 2025 •

edited

Loading