-
Notifications
You must be signed in to change notification settings - Fork 246
Fix historical query LRU cache size calculations #7511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix historical query LRU cache size calculations #7511
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a critical bug in the historical query LRU cache size calculations. The adjust_ranges method was not properly tracking removed sequence numbers when a request's range changed to a disjoint prefix, causing estimated_store_cache_size to become inflated with orphaned counts.
Key Changes:
- Fixed
adjust_rangesto collect removed seqnos before erasing them (lines 314-317) - Simplified
any_difftracking by deriving it from removed/added vectors instead of maintaining a separate boolean - Added
get_estimated_store_cache_size()accessor for testing
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/node/historical_queries.h |
Core bug fix in adjust_ranges to track suffix removals, simplified any_diff logic, added size accessor method |
src/node/test/historical_queries.cpp |
Added regression test for cache size estimation and randomized test for adjust_ranges correctness |
Co-authored-by: Max <maxtropets@microsoft.com> (cherry picked from commit 4505d32)
We're logging the
FAILline inlru_shrink_to_fita surprising amount. This is a symptom of a deeper bug - theestimated_store_cache_sizeis not accurately kept up-to-date with the in-memory stores, due to a bug inadjust_ranges. There is a path inadjust_rangesto handle the new range being a disjoint prefix of the previous range, and when that was hit a suffix of seqnos were discarded without being added toremoved.There's a unit test that captures a minimal repro of this - handle X was previously requesting seqno 100, ledger entry at 100 is provided, handle X is updated to request seqno 42. We drop the state for 100 as expected, but didn't previously remove the refs to decrement
estimated_store_cache_size, so we had some orphaned counts in there that could inflate forever.As a little bit of paranoia, there's then a randomised variant of the test, exposing the internal
adjust_rangesfunction and calling it with various randomly generated arguments, confirming that the[removed, added]pair it returns (calculated by some awkward iterator logic internally) matches what we'd expect from set differences on the caller side.While there, do a related bit of simplification: don't track an explicit
any_diffbool. It should be (and now is) equivalent to!removed.empty() || !added.empty().