-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-1986. Fix listkeys API. #1588
Conversation
/label ozone |
8c7e306
to
f06893e
Compare
1a83fdb
to
676e024
Compare
// of maxKeys from the sorted map. | ||
currentCount = 0; | ||
|
||
for (Map.Entry<String, OmKeyInfo> cacheKey : cacheKeyMap.entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second iteration is unfortunate. We should see if there is a way to avoid it.
} | ||
|
||
@Test | ||
public void testListKeysWithFewDeleteEntriesInCache() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are you ensuring entries are in the cache? For that you have to pause double buffer flush right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test case, we are adding entries in cache manually. It is not an integration test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks okay. I am worried about the potential performance of listKeys though. I am okay to let this go in for now since it fixes a correctness issue. However we will need to measure & fix list performance at some point.
List<OmKeyInfo> result = new ArrayList<>(); | ||
if (maxKeys == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or <= 0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that would be a nice bit of defensive programming. Let's make the check <= 0
.
// and construct treeMap which match with keyPrefix and are greater than or | ||
// equal to startKey. Later we can revisit this, if list operation | ||
// is becoming slow. | ||
while (iterator.hasNext()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many keys are expected in this cache? and how many in the tree ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that we are better off leaving the old code in place...where we can read from the DB.. Worst, we might have to make sure that cache is flushed to DB before doing the list operation.But practically it may not matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok with putting this change in if we can prove that we can do large list keys. You might want to borrow the DB from @nandakumar131 and see if you can list keys with this patch, just a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key cache is not full cache, so if double buffer flush is going on well in background, this should have around couple of 100 entries. When I started freon with 10 threads, i see the value of maximum iteration is 200. So, almost in the cache we have 200 entries. (But on tried with busy workload clusters, slow disks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With current new code when the list happens we should consider entries from buffer and DB. (As we return the response to end-user after adding entries to cache). So, if user does list as next operation(next to create bucket) the bucket might/might not be there until double buffer flushes. As until double buffer flushes, we will have entries in cache. (This will not be problem for non-HA, as we return the response, only after the flush)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I am okay to get this committed with a minor comment below, assuming there are no unaddressed comments from @anuengineer.
We should benchmark list operations later in case any further optimization is needed.
Addressed the review comment. |
I will run the benchmarks once after the list Operations is fixed. |
Let us get this in, I expect some of these things we will learn the right choices only when we really benchmark and test. The sad this is that some of these changes can make our system unstable. |
Thank You @arp7 and @anuengineer for the review. |
c250efb
to
d7448c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the patch looks ok.
We should definitely try to optimise the listKeys call.
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bharatviswa504 for working on this, overall the change looks ok.
d7448c3
to
5ef53de
Compare
Thank You all for the review. |
Implement listKeys API.