dungeon-master-666 commented Nov 16, 2025

I noticed that TryCatchUpWithPrimary takes a long time in my setup even when there are no changes in the DB. For a 500 GB RocksDB instance it takes up to ~50 ms, and most of that time is spent tail-reading the MANIFEST file.

This PR implements the following optimization: cache the MANIFEST file size and, if it has not grown since the last call, return immediately. This makes sense for two reasons:

  • The MANIFEST does not change often
  • A syscall to get the file size is much cheaper than reading the file from a specific offset

Implementation details:

  • add a helper for ReactiveVersionSet to query the current MANIFEST size via the filesystem
  • cache the last known size using the existing VersionSet::manifest_file_size_ and skip ManifestTailer::Iterate() when the file hasn’t grown
  • reset the cached size whenever we reopen/switch the MANIFEST so the next pass reprocesses from the beginning
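
For readers who want the idea in isolation, here is a minimal standalone sketch of the size-gating logic described above. It is not the code from this PR and does not use RocksDB's internal APIs: `ManifestGrowthGate`, `ShouldTail`, `Reset`, and `StatSize` are hypothetical names, and `std::filesystem::file_size` stands in for whatever filesystem call the real helper uses. The sketch also assumes that a failed size query should fall back to the slow path, and that any size change (not only growth) triggers a tail pass.

```cpp
#include <cstdint>
#include <filesystem>
#include <functional>
#include <optional>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical helper (not RocksDB code): decides whether a tailing pass
// over an append-only, MANIFEST-like file can be skipped.
class ManifestGrowthGate {
 public:
  // Returns the file size, or nullopt if it could not be determined.
  using SizeFn = std::function<std::optional<uint64_t>(const std::string&)>;

  // Default size query: a single metadata lookup, no file reads.
  static std::optional<uint64_t> StatSize(const std::string& path) {
    std::error_code ec;
    const auto size = std::filesystem::file_size(path, ec);
    if (ec) return std::nullopt;
    return static_cast<uint64_t>(size);
  }

  explicit ManifestGrowthGate(SizeFn size_fn = &ManifestGrowthGate::StatSize)
      : size_fn_(std::move(size_fn)) {}

  // Call when the file is reopened or switched, so the next pass
  // is never skipped.
  void Reset() { last_size_.reset(); }

  // Returns true when the caller should run the (expensive) tail read.
  bool ShouldTail(const std::string& path) {
    const std::optional<uint64_t> size = size_fn_(path);
    if (!size) return true;  // Size unknown: take the slow path.
    if (last_size_ && *size == *last_size_) return false;  // Unchanged.
    // Simplification: a real implementation would record the size only
    // after the tail read has actually consumed the new bytes.
    last_size_ = size;
    return true;
  }

 private:
  SizeFn size_fn_;
  std::optional<uint64_t> last_size_;
};
```

A caller would check `gate.ShouldTail(manifest_path)` before each catch-up pass and call `gate.Reset()` whenever it switches to a new MANIFEST. The size query is injectable only to keep the sketch easy to exercise without touching the disk.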

This optimization speeds up TryCatchUpWithPrimary by up to 3x in my tests.

@meta-cla meta-cla bot added the CLA Signed label Nov 16, 2025