Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix delete obsolete files on recovery not rate limited #12590

Closed
wants to merge 5 commits into from

Conversation

jowlyzhang
Copy link
Contributor

@jowlyzhang jowlyzhang commented Apr 26, 2024

This PR fix the issue that deletion of obsolete files during DB::Open are not rate limited.

The root cause is slow deletion is disabled if trash/db size ratio exceeds the configured max_trash_db_ratio

double max_trash_db_ratio = 0.25,
however, the current handling in DB::Open starts with tracking nothing but the obsolete files. This will make the ratio always look like it's 1.

In order for the deletion rate limiting logic to work properly, we should only start deleting files after SstFileManager has finished tracking the whole DB, so the main fix is to move these two places that attempts to delete file after the tracking are done: 1) the DeleteScheduler::CleanupDirectory call in SanitizeOptions, 2) the DB::DeleteObsoleteFiles call.

There are some other aesthetic changes like refactoring collecting all the DB paths into a function, rename DBImp::DeleteUnreferencedSstFiles to DBImpl:: MaybeUpdateNextFileNumber as it doesn't actually delete the files.

Test plan:
Added unit test and verified with manual testing

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First impressions. Thanks for the fix!

// A set of all the directories used by this DB.
// Note that these paths do not have a tailing path separator "/"
// (kFilePathSeparator).
std::unordered_set<std::string> all_db_paths_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My sense is that it would be better to use an ordered set here, for predictability of behavior. For example I believe there are some cases where the error returned on a multiply-corrupt DB depends on the iteration order of this set. It's less confusing if that error is not dependent on factors like how RocksDB was compiled. This could also appear as flaky test behavior in the future.

Data structure performance should not really be a concern here, as these data structure ops are generally paired with IO ops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this change that may cause regression.

mutex_.AssertHeld();
std::vector<std::string> paths;
paths.push_back(NormalizePath(dbname_ + std::string(1, kFilePathSeparator)));
void DBImpl::CollectAllDBPaths() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this functionality factored into a function of a giant class, I would like to see enforcement of preconditions. Specifically, either check that it hasn't been called before, or allow repeated calls with no ill effect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a set empty before populating check.

@@ -1559,22 +1558,38 @@ class DBImpl : public DB {
// Assign db_id_ and write DB ID to manifest if necessary.
void SetDBId(std::string&& id, bool read_only, RecoveryContext* recovery_ctx);

// Collect a deduplicated collection of paths used by this DB, including
// dbname_, DBOptions.db_paths, ColumnFamilyOptions.cf_paths. These paths
// cannot be dynamically changed, so they are collected once during DB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true even if CreateColumnFamily is called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point. This is indeed not true when new column families are added during DB session, so this set needs to be refreshed there too. Thank you for catching this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally missed how column family operations would affect this set. The whole assumption that this set is fixed once DB initialized and is kept intact through the whole DB session is not true at all. So it's not useful to have a member variable storing them like this. Although I can refresh the member variable when column families are updated, the case for when column families are added are relatively straightforward but the case for when a column family becomes completely dereferenced is more nuanced.

So instead I just make the function a pure util function for collecting these paths on demand, it's just a code dedupe refactor.

Copy link
Contributor

@anand1976 anand1976 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for this fix.

@@ -1023,4 +997,47 @@ Status DBImpl::DeleteUnreferencedSstFiles(RecoveryContext* recovery_ctx) {
return s;
}

void DBImpl::TrackExistingDataFiles(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe move this to db_impl_open.cc since its only called by DBImpl::Open?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good call. Thank you for the suggestion!

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jowlyzhang
Copy link
Contributor Author

Thank you for help review this change!

@facebook-github-bot
Copy link
Contributor

@jowlyzhang merged this pull request in 2412530.

@jowlyzhang jowlyzhang deleted the discard_stall_on_open branch May 2, 2024 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants