-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-28468: Integration of time-based priority caching in eviction logic. #5826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are passing @vinayakphegde 's commit together with yours in this PR, and this is causing conflicts with the target branch (the PR can't be merged). Please rebase your local branch first, resolve any conflicts locally and force push to your branch, making sure the PR is mergeable.
…ogic. The time-based priority caching relies on the presence of file paths in the block-cache key. However, in case of the persitent cache, the file paths are not persisted in the files. Hence, when the region server is restarted, the block cache keys need to be repopulated with the file paths. This change addresses the following: 1. Always populate the block-cache key with path during its creation. 2. Fetch the file paths corresponding to the file names of the block-cache key during restarts. 3. Use the Data-Tiering-Manager APIs during cache-full scenario to evict the cold file blocks. Change-Id: Ice19bd41064c73538ee3d3755057813a531b9171
BlockCacheKey key = new BlockCacheKey(protoKey.getHfilename(), protoKey.getOffset(), | ||
protoKey.getPrimaryReplicaBlock(), fromPb(protoKey.getBlockType())); | ||
|
||
BlockCacheKey key; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need handling with and without paths in the BlockCacheKey creation
💔 -1 overall
This message was automatically generated. |
Ack! |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a UT for the eviction logic? I can't see any test for that.
@@ -79,7 +79,7 @@ public void run() { | |||
// so we check first if the block exists on its in-memory index, if so, we just | |||
// update the offset and move on to the next block without actually going read all | |||
// the way to the cache. | |||
BlockCacheKey cacheKey = new BlockCacheKey(name, offset); | |||
BlockCacheKey cacheKey = new BlockCacheKey(path, offset, true, BlockType.DATA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you know this is a Data block? At this point, I don't think this can be assumed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constructor used earlier(BlockCacheKey(name, offset);) sets these parameters internally. Hence, I used these parameters:
public BlockCacheKey(String hfileName, long offset) {
this(hfileName, offset, true, BlockType.DATA);
}
// Data-Tiering manager has not been set up. | ||
// Ignore the error and proceed with the normal flow. | ||
LOG.error("Error while getting DataTieringManager instance: {}", e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this to be expected often? If so, let's lower the log level to DEBUG. If not, it seems we can still continue RS normal functioning, so it should rather be a WARNING than an ERROR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack!
|
||
// Check the list of files to determine the cold files which can be readily evicted. | ||
Set<String> coldFiles = | ||
DataTieringManager.getInstance().getColdDataFiles(backingMap.keySet()); | ||
if (coldFiles != null) { | ||
for(String fileName : coldFiles) { | ||
evictBlocksByHfileName(fileName); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do like this here, but then can we think if we can modify DataTieringManager somehow to track when we have no CF with TIME_RANGE type? That way we could avoid this extra loops through the whole set of block keys unnecessarily.
Alternatively, we could follow the BucketEntryGroup logic starting from #987. We could define a COLD priority group and add that as the first group in the priority queue. That way we don't need this extra loop over the whole block sets to find out which blocks are cold. We would leverage the already existing loop from line #996 for that.
Can we close this PR and favour the #5829 as the solution for HBASE-28468? |
ack! |
Closing this PR in favour of #5829 as the solution for HBASE-28468? |
HBASE-28468: Integration of time-based priority caching in eviction logic.
The time-based priority caching relies on the presence of file paths in the block-cache key.
However, in case of the persitent cache, the file paths are not persisted in the files.
Hence, when the region server is restarted, the block cache keys need to be
repopulated with the file paths.
This change addresses the following:
during restarts.