-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plug two leaks around dirty_stores #19025
Conversation
@@ -5233,8 +5235,11 @@ impl AccountsDb { | |||
); | |||
let count = store.remove_account(account_info.stored_size, reset_accounts); | |||
if count == 0 { | |||
self.dirty_stores | |||
.insert((*slot, store.append_vec_id()), store.clone()); | |||
// expected_slot is none should exclude write path from flush/shrink/bank drop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind of hack, but handle_reclaims()
code path is literally maze.
@@ -2254,8 +2254,10 @@ impl AccountsDb { | |||
if let Some(slot_stores) = self.storage.get_slot_stores(slot) { | |||
slot_stores.write().unwrap().retain(|_key, store| { | |||
if store.count() == 0 { | |||
self.dirty_stores | |||
.insert((slot, store.append_vec_id()), store.clone()); | |||
if !is_startup { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
admittedly, this is another hack. xD
glad that we have managed to revive the almost-dying flag (_
-prefixed) for quick win. :)
note: there is no test (also ci failed...) also, here is a crude way to detect leak when shrink_all is running:
|
@@ -6009,7 +6019,7 @@ impl AccountsDb { | |||
if pass == 0 { | |||
// Need to add these last, otherwise older updates will be cleaned | |||
for slot in &slots { | |||
self.accounts_index.add_root(*slot, false); | |||
self.add_root_to_update_index(*slot, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add any keys which have more than 1 key:
solana/runtime/src/accounts_db.rs
Lines 5904 to 5906 in d8984cb
if !dirty_pubkeys.is_empty() { | |
self.uncleaned_pubkeys.insert(*slot, dirty_pubkeys); | |
} |
This uses the uncleaned_pubkeys
and not dirty_stores
which is why the dirty_store
time is small, but I don't think we need to perform a clean on keys which only have 1 entry in them, no?
The other case is 0-lamport accounts. We can detect that in generate_index and also add it to the clean set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, now i view things clearly. I totally agree with you.
@@ -6009,7 +6019,7 @@ impl AccountsDb { | |||
if pass == 0 { | |||
// Need to add these last, otherwise older updates will be cleaned | |||
for slot in &slots { | |||
self.accounts_index.add_root(*slot, false); | |||
self.add_root_to_update_index(*slot, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
Problem
there is regression since #17601
at start-up (
verify_snapshot_bank
), dirty_stores accumulatesArc
s to all shrunken stores becauseclean_accounts
aren't run inverify_snapshot_bank
after the initialclean_accounts()
.So that keeps mmap leaks. This surge of mmap use here is rather transient; after all shrinking and next clean_accounts(), it's usually
munmap()
ed en masse.also the initial
clean_accounts()
isn't complete as was before. That's becausecandidate_clean_keys
are created from emptydirty_stores
. So, ultimately, this leaves more appendvecs than ideally minimized.BEFORE
note the almost instant
dirty_store_processing_us
:Summary of Changes
make dirty_stores population logic more selective.
correctly populate dirty_stores in generate_index().
AFTER
LEFT is before, right is after
Remarks:
Also, I considered following approaches but turned them down:
dirty_stores
while bootingdirty_stores
:DashMap<_, std:sync::Arc<_>>
note that there is still very slow leaks, which is occurring even on v1.6.... this pr doesn't address that leak.