HDDS-1986. Fix listkeys API. #1588

bharatviswa504 · 2019-10-03T22:54:41Z

Implement listKeys API.

smengcl · 2019-10-03T22:55:40Z

/label ozone

arp7 · 2019-10-08T01:29:57Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java

+    // of maxKeys from the sorted map.
+    currentCount = 0;
+
+    for (Map.Entry<String, OmKeyInfo>  cacheKey : cacheKeyMap.entrySet()) {


The second iteration is unfortunate. We should see if there is a way to avoid it.

arp7 · 2019-10-08T01:35:20Z

hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/TestOmMetadataManager.java

+  }
+
+  @Test
+  public void testListKeysWithFewDeleteEntriesInCache() throws Exception {


How are you ensuring entries are in the cache? For that you have to pause the double buffer flush, right?

In this test case, we add entries to the cache manually. It is not an integration test.

arp7

The change looks okay. I am worried about the potential performance of listKeys though. I am okay to let this go in for now since it fixes a correctness issue. However we will need to measure & fix list performance at some point.

anuengineer · 2019-10-07T21:55:51Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java

    List<OmKeyInfo> result = new ArrayList<>();
+    if (maxKeys == 0) {


Yeah that would be a nice bit of defensive programming. Let's make the check <= 0.
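
A minimal sketch of the `<= 0` guard suggested above. It is not the patch's code: the class name, the `List<String>` stand-in for the key table, and the method signature are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real listKeys method; only the guard matters here.
final class MaxKeysGuard {

  static List<String> listKeys(List<String> sortedKeys, int maxKeys) {
    List<String> result = new ArrayList<>();
    // Defensive check: zero and negative limits both return an empty list
    // instead of falling through to the iteration below.
    if (maxKeys <= 0) {
      return result;
    }
    for (String key : sortedKeys) {
      if (result.size() >= maxKeys) {
        break;
      }
      result.add(key);
    }
    return result;
  }
}
```

With `== 0`, a negative `maxKeys` would skip the early return; whether that then returns nothing or everything depends on how the loop condition is written, which is why the wider check is the safer choice.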

anuengineer · 2019-10-07T21:57:32Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java

+    // and construct treeMap which match with keyPrefix and are greater than or
+    // equal to startKey. Later we can revisit this, if list operation
+    // is becoming slow.
+    while (iterator.hasNext()) {


How many keys are expected in this cache? And how many in the tree?

I feel that we are better off leaving the old code in place, where we can read from the DB. At worst, we might have to make sure that the cache is flushed to the DB before doing the list operation. But practically it may not matter.

I am ok with putting this change in if we can prove that we can do large list keys. You might want to borrow the DB from @nandakumar131 and see if you can list keys with this patch, just a thought.

The key cache is not a full cache, so if the double buffer flush is keeping up in the background, it should hold around a couple of hundred entries. When I ran freon with 10 threads, the maximum iteration count I saw was 200, so the cache holds roughly 200 entries at most. (But I have not tried this on clusters with busy workloads or slow disks.)

With the new code, a list operation has to consider entries from both the cache and the DB, because we return the response to the end user as soon as the entries are added to the cache. So if a user issues a list as the next operation (right after create bucket), the bucket might or might not be in the DB until the double buffer flushes; until then, the entries exist only in the cache. (This is not a problem for non-HA, as we return the response only after the flush.)
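
To make the cache-plus-DB idea concrete, here is a rough sketch of one way to merge the two sources. It is an illustration under stated assumptions, not the code in this patch: plain `Map`s stand in for the RocksDB table and the table cache, `Optional.empty()` is an assumed marker for a key that is deleted in the cache but not yet flushed, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch: merge flushed (DB) and unflushed (cache) entries for a list call.
final class ListKeysMergeSketch {

  static List<String> listKeys(Map<String, String> dbKeys,
      Map<String, Optional<String>> cache,
      String startKey, String keyPrefix, int maxKeys) {
    List<String> result = new ArrayList<>();
    if (maxKeys <= 0) {
      return result;
    }

    // TreeMap keeps the merged view sorted by key name.
    TreeMap<String, String> merged = new TreeMap<>();
    for (Map.Entry<String, String> e : dbKeys.entrySet()) {
      if (matches(e.getKey(), startKey, keyPrefix)) {
        merged.put(e.getKey(), e.getValue());
      }
    }

    // Cache entries win over DB entries: a present value adds or overrides,
    // an empty value means "marked for delete" and hides the DB entry.
    for (Map.Entry<String, Optional<String>> e : cache.entrySet()) {
      String key = e.getKey();
      if (!matches(key, startKey, keyPrefix)) {
        continue;
      }
      if (e.getValue().isPresent()) {
        merged.put(key, e.getValue().get());
      } else {
        merged.remove(key);
      }
    }

    // Take at most maxKeys keys from the sorted view.
    for (String key : merged.keySet()) {
      if (result.size() >= maxKeys) {
        break;
      }
      result.add(key);
    }
    return result;
  }

  private static boolean matches(String key, String startKey, String keyPrefix) {
    return key.startsWith(keyPrefix) && key.compareTo(startKey) >= 0;
  }
}
```

This sketch reads every DB entry, which is exactly the performance concern raised in this thread; a real implementation would seek the DB iterator to `startKey` and stop early rather than scan the whole table.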

arp7

+1

I am okay to get this committed with a minor comment below, assuming there are no unaddressed comments from @anuengineer.

We should benchmark list operations later in case any further optimization is needed.

bharatviswa504 · 2019-10-10T03:38:35Z

Addressed the review comment.

bharatviswa504 · 2019-10-10T03:39:23Z

+1

I am okay to get this committed with a minor comment below, assuming there are no unaddressed comments from @anuengineer.

We should benchmark list operations later in case any further optimization is needed.

I will run the benchmarks once the list operations are fixed.

anuengineer · 2019-10-10T16:28:51Z

Let us get this in. I expect we will learn the right choices for some of these things only when we really benchmark and test. The sad thing is that some of these changes can make our system unstable.
FYI: @elek, I know that you might not be happy with this approach. But it is hard to judge the impact of these changes until we have this code in and start testing.

bharatviswa504 · 2019-10-10T16:46:33Z

Thank You @arp7 and @anuengineer for the review.
I will try to run the benchmark and update it.

nandakumar131

Overall the patch looks ok.
We should definitely try to optimise the listKeys call.

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java

nandakumar131

Thanks @bharatviswa504 for working on this, overall the change looks ok.

bharatviswa504 · 2019-10-10T23:49:26Z

Thank You all for the review.
I have committed this to the trunk.

bharatviswa504 force-pushed the HDDS-1986 branch from 8c7e306 to f06893e Compare October 3, 2019 22:57

bharatviswa504 added the ozone label Oct 3, 2019

bharatviswa504 self-assigned this Oct 3, 2019

bharatviswa504 force-pushed the HDDS-1986 branch 2 times, most recently from 1a83fdb to 676e024 Compare October 4, 2019 23:36

arp7 reviewed Oct 8, 2019

arp7 approved these changes Oct 8, 2019

anuengineer reviewed Oct 8, 2019

arp7 approved these changes Oct 10, 2019

bharatviswa504 force-pushed the HDDS-1986 branch from c250efb to d7448c3 Compare October 10, 2019 17:38

nandakumar131 requested changes Oct 10, 2019

nandakumar131 approved these changes Oct 10, 2019

bharatviswa504 added 4 commits October 10, 2019 13:21

HDDS-1986. Fix listkeys API.

eef6ea4

HDDS-1986. add test for key marked for delete and few changes in code.

70e12c8

Fix review comment

f1a2cb5

fix jenkins checkstyle.

5ef53de

bharatviswa504 force-pushed the HDDS-1986 branch from d7448c3 to 5ef53de Compare October 10, 2019 20:30

apache deleted a comment from hadoop-yetus Oct 10, 2019

bharatviswa504 merged commit 9c72bf4 into apache:trunk Oct 10, 2019

amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019

HDDS-1986. Fix listkeys API. (apache#1588)

410873a

RogPodge pushed a commit to RogPodge/hadoop that referenced this pull request Mar 25, 2020

HDDS-1986. Fix listkeys API. (apache#1588)

0fbd7c6
