-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a blob-specific cache priority #10309
Conversation
The basic idea of this PR is:
|
bdadb6e
to
3d94420
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gangliao for the patch! Please see some questions/comments below
@gangliao has updated the pull request. You must reimport the pull request before landing. |
@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@@ -454,6 +454,7 @@ void BlockBasedTableFactory::InitializeOptions() { | |||
// It makes little sense to pay overhead for mid-point insertion while the | |||
// block size is only 8MB. | |||
co.high_pri_pool_ratio = 0.0; | |||
co.low_pri_pool_ratio = 1.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: I see that in LRUCache, the default of low_pri_pool_ratio is 0.5, while it is 1.0 here. Am I missing anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use a constant to represent the default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@riversand963 's comment just made me realize that we would probably want to adjust the default of low_pri_pool_ratio
for the purposes of backward compatibility. That is, I think what we would want is a default of say, -1, where negative values would be interpreted as "use (1 - high_pri_pool_ratio)". With the current default of 0.5, we run the risk of applications hitting errors during cache creation if they set high_pri_pool_ratio to a value greater than 0.5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could make negative values mean "unlimited low-priority pool capacity."
Side note: it would make sense for the pool capacity data members to be size_t
s as opposed to double
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
What does the performance test say? That needs to be in "test plan" for non-trivial changes to performance-critical structures |
Yeah, we will do a bunch of perf tests during this week. |
@gangliao has updated the pull request. You must reimport the pull request before landing. |
include/rocksdb/cache.h
Outdated
low_pri_pool_ratio(_low_pri_pool_ratio == kDefaultLowPriortyPoolRatio | ||
? 1.0 - _high_pri_pool_ratio | ||
: _low_pri_pool_ratio), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it would be nicer to deal with this inside LRUCache
. Also, we could treat all out-of-range (< 0 or > 1) values as "use 1 - high pro pool ratio" (or alternatively, as "no limit on the size of the low-priority pool" as we discussed)
include/rocksdb/cache.h
Outdated
const double kDefaultLowPriortyPoolRatio = -1.0; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defining a namespace-scope variable in the header like this can lead to ODR violations. Also, the variable name is misspelled (then again, the simplest solution would be to ditch this constant)
include/rocksdb/cache.h
Outdated
assert(high_pri_pool_ratio >= 0.0 && high_pri_pool_ratio <= 1.0); | ||
assert(low_pri_pool_ratio >= 0.0 && low_pri_pool_ratio <= 1.0); | ||
assert(high_pri_pool_ratio + low_pri_pool_ratio <= 1.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these assertions are legit in the sense that they are not guaranteed hold when the object is created (the range checks are done in NewLRUCache
IIRC)
I'm not sure why there is a test failure. Even after rolling back to Sun Jul 17, the problem still exists. @ltamasi ./lru_cache_test --gtest_filter=DBSecondaryCacheTest.TestSecondaryCacheCorrectness1
Note: Google Test filter = DBSecondaryCacheTest.TestSecondaryCacheCorrectness1
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBSecondaryCacheTest
[ RUN ] DBSecondaryCacheTest.TestSecondaryCacheCorrectness1
Assertion failed: (cache_), function BlobFileCache, file db/blob/blob_file_cache.cc, line 34.
fish: Job 1, './lru_cache_test --gtest_filter…' terminated by signal SIGABRT (Abort) Okay, just locate the bug is from this PR. |
@gangliao has updated the pull request. You must reimport the pull request before landing. |
@gangliao has updated the pull request. You must reimport the pull request before landing. |
@gangliao has updated the pull request. You must reimport the pull request before landing. |
@gangliao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
I believe what @pdillinger was referring to above is making sure the change does not cause a performance regression. The way to do that would be to compare |
Summary: This reverts commit 8d17809 because of a clear performance regression seen in internal dashboard https://fburl.com/unidash/tpz75iee
} else { | ||
// Insert "e" to the head of low-pri pool. Note that when | ||
// high_pri_pool_ratio is 0, head of low-pri pool is also head of LRU list. | ||
} else if (low_pri_pool_ratio_ > 0 && (e->IsLowPri() || e->HasHit())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm fairly sure this is the line which causes the regression (see #10434 ). I think we have a bug causing priority inversion here: if there is no high-priority pool configured but there is a low-priority pool, high-priority items can end up in the bottom-priority pool. If so, it should be an easy fix
Summary: This reverts commit 8d17809 because of a clear performance regression seen in internal dashboard https://fburl.com/unidash/tpz75iee Pull Request resolved: #10434 Reviewed By: ltamasi Differential Revision: D38256373 Pulled By: pdillinger fbshipit-source-id: 134aa00f50dd7b1bbe037c227884a351342ec44b
Summary:
RocksDB's
Cache
abstraction currently supports two priority levels for items: high (used for frequently accessed/highly valuable SST metablocks like index/filter blocks) and low (used for SST data blocks). Blobs are typically lower-value targets for caching than data blocks, since 1) with BlobDB, data blocks containing blob references conceptually form an index structure which has to be consulted before we can read the blob value, and 2) cached blobs represent only a single key-value, while cached data blocks generally contain multiple KVs. Since we would like to make it possible to use the same backing cache for the block cache and the blob cache, it would make sense to add a new, lower-than-low cache priority level (bottom level) for blobs so data blocks are prioritized over them.This task is a part of #10156