
Add a blob-specific cache priority #10309

Closed · wants to merge 25 commits

Conversation

@gangliao (Contributor) commented Jul 5, 2022

Summary:

RocksDB's Cache abstraction currently supports two priority levels for items: high (used for frequently accessed/highly valuable SST metablocks like index/filter blocks) and low (used for SST data blocks). Blobs are typically lower-value targets for caching than data blocks, since 1) with BlobDB, data blocks containing blob references conceptually form an index structure which has to be consulted before we can read the blob value, and 2) cached blobs represent only a single key-value, while cached data blocks generally contain multiple KVs. Since we would like to make it possible to use the same backing cache for the block cache and the blob cache, it would make sense to add a new, lower-than-low cache priority level (bottom level) for blobs so data blocks are prioritized over them.

This task is a part of #10156

@gangliao gangliao changed the title Add a blob-specific cache priority [Draft] Add a blob-specific cache priority Jul 5, 2022
@gangliao gangliao mentioned this pull request Jul 8, 2022
@gangliao gangliao changed the title [Draft] Add a blob-specific cache priority Add a blob-specific cache priority Jul 9, 2022
@gangliao (Contributor, Author) commented Jul 11, 2022

The basic idea of this PR is:

  • Add a new priority level, BOTTOM, for blob caching: HIGH > LOW > BOTTOM.
  • Add a new midpoint for the bottom level:
x x x x B x x x x x x x x x L x x x x x H
        |                   |
    mid-point           mid-point

@gangliao gangliao force-pushed the cache_pri branch 2 times, most recently from bdadb6e to 3d94420 Compare July 19, 2022 21:44
@ltamasi (Contributor) left a comment:

Thanks @gangliao for the patch! Please see some questions/comments below.

(Review threads on: cache/cache_bench_tool.cc, HISTORY.md, include/rocksdb/cache.h, cache/lru_cache.cc, cache/lru_cache.h, cache/lru_cache_test.cc, cache/compressed_secondary_cache_test.cc)
@facebook-github-bot: @gangliao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot: @gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -454,6 +454,7 @@ void BlockBasedTableFactory::InitializeOptions() {
   // It makes little sense to pay overhead for mid-point insertion while the
   // block size is only 8MB.
   co.high_pri_pool_ratio = 0.0;
+  co.low_pri_pool_ratio = 1.0;
Question: I see that in LRUCache, the default of low_pri_pool_ratio is 0.5, while it is 1.0 here. Am I missing anything?

Maybe use a constant to represent the default value?

@riversand963 's comment just made me realize that we would probably want to adjust the default of low_pri_pool_ratio for the purposes of backward compatibility. That is, I think what we would want is a default of say, -1, where negative values would be interpreted as "use (1 - high_pri_pool_ratio)". With the current default of 0.5, we run the risk of applications hitting errors during cache creation if they set high_pri_pool_ratio to a value greater than 0.5.


Alternatively, we could make negative values mean "unlimited low-priority pool capacity."

Side note: it would make sense for the pool capacity data members to be size_ts as opposed to doubles.

@gangliao (Author) replied:
Will do.

@pdillinger (Contributor) commented:
What does the performance test say? That needs to be in "test plan" for non-trivial changes to performance-critical structures

@gangliao (Author) replied:
What does the performance test say? That needs to be in "test plan" for non-trivial changes to performance-critical structures

Yeah, we will do a bunch of perf tests during this week.


Comment on lines 123 to 125
low_pri_pool_ratio(_low_pri_pool_ratio == kDefaultLowPriortyPoolRatio
? 1.0 - _high_pri_pool_ratio
: _low_pri_pool_ratio),
I feel it would be nicer to deal with this inside LRUCache. Also, we could treat all out-of-range (< 0 or > 1) values as "use 1 - high-pri pool ratio" (or alternatively, as "no limit on the size of the low-priority pool", as we discussed).

Comment on lines 51 to 52
const double kDefaultLowPriortyPoolRatio = -1.0;

Defining a namespace-scope variable in the header like this can lead to ODR violations. Also, the variable name is misspelled (then again, the simplest solution would be to ditch this constant)

Comment on lines 129 to 131
assert(high_pri_pool_ratio >= 0.0 && high_pri_pool_ratio <= 1.0);
assert(low_pri_pool_ratio >= 0.0 && low_pri_pool_ratio <= 1.0);
assert(high_pri_pool_ratio + low_pri_pool_ratio <= 1.0);
I don't think these assertions are legit, in the sense that they are not guaranteed to hold when the object is created (the range checks are done in NewLRUCache IIRC).

@gangliao (Author) commented Jul 27, 2022:

I'm not sure why there is a test failure. Even after rolling back to Sun Jul 17, the problem still exists. @ltamasi

 ./lru_cache_test --gtest_filter=DBSecondaryCacheTest.TestSecondaryCacheCorrectness1
Note: Google Test filter = DBSecondaryCacheTest.TestSecondaryCacheCorrectness1
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBSecondaryCacheTest
[ RUN      ] DBSecondaryCacheTest.TestSecondaryCacheCorrectness1
Assertion failed: (cache_), function BlobFileCache, file db/blob/blob_file_cache.cc, line 34.
fish: Job 1, './lru_cache_test --gtest_filter…' terminated by signal SIGABRT (Abort)

Okay, I just located it: the bug is from this PR.


@ltamasi (Contributor) commented Jul 28, 2022:

What does the performance test say? That needs to be in "test plan" for non-trivial changes to performance-critical structures

Yeah, we will do a bunch of perf tests during this week.

I believe what @pdillinger was referring to above is making sure the change does not cause a performance regression. The way to do that would be to compare db_bench results with and without the change.

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Jul 28, 2022
Summary: This reverts commit 8d17809
because of a clear performance regression seen in internal dashboard
https://fburl.com/unidash/tpz75iee
-    } else {
-      // Insert "e" to the head of low-pri pool. Note that when
-      // high_pri_pool_ratio is 0, head of low-pri pool is also head of LRU list.
+    } else if (low_pri_pool_ratio_ > 0 && (e->IsLowPri() || e->HasHit())) {
So I'm fairly sure this is the line which causes the regression (see #10434 ). I think we have a bug causing priority inversion here: if there is no high-priority pool configured but there is a low-priority pool, high-priority items can end up in the bottom-priority pool. If so, it should be an easy fix

facebook-github-bot pushed a commit that referenced this pull request Jul 29, 2022
Summary:
This reverts commit 8d17809
because of a clear performance regression seen in internal dashboard
https://fburl.com/unidash/tpz75iee

Pull Request resolved: #10434

Reviewed By: ltamasi

Differential Revision: D38256373

Pulled By: pdillinger

fbshipit-source-id: 134aa00f50dd7b1bbe037c227884a351342ec44b