-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-24387 TableSnapshotInputFormatImpl support row limit on each InputSplit #1731
Conversation
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
2e413d5
to
c4c3e4c
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@HorizonNet @virajjasani mind help review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. Let me kick off the build again to have a clean one.
} | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Can you please remove this additional empty line?
Oh, was wrong. Jenkins already reports it as green. |
@@ -274,6 +274,7 @@ message Scan { | |||
} | |||
optional ReadType readType = 23 [default = DEFAULT]; | |||
optional bool need_cursor_result = 24 [default = false]; | |||
optional uint32 limit = 25; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this is duplicate with ScanRequest protobuf? We already have one field to means "limit".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add a new filed to means "scan limit"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, duplicate field may confuse user
b4e1f97
to
3dcefa2
Compare
🎊 +1 overall
This message was automatically generated. |
Job job = new Job(UTIL.getConfiguration()); | ||
Path tmpTableDir = UTIL.getDataTestDirOnTestFS(snapshotName); | ||
// limit the scan | ||
Scan scan = new Scan().setLimit(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. Left two minor points.
@@ -304,6 +305,56 @@ public void testWithMockedMapReduceWithNoStartRowStopRow() throws Exception { | |||
} | |||
} | |||
|
|||
@Test | |||
public void testScanLimit() throws Exception { | |||
setupCluster(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to move this one to a test initializer method? I see it used in several tests in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to submit another PR to fix?after this one merged, I will amend it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's up to you. We can also do it in a separate ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to help merge this PR? Thanks
UTIL.getConfiguration().unset(SNAPSHOT_INPUTFORMAT_ROW_LIMIT_PER_INPUTSPLIT); | ||
UTIL.getAdmin().deleteSnapshot(snapshotName); | ||
UTIL.deleteTable(tableName); | ||
tearDownCluster(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
…putSplit (#1731) Signed-off-by: Guanghao Zhang <zghao@apache.org>
…putSplit (#1731) Signed-off-by: Guanghao Zhang <zghao@apache.org>
…putSplit (#1731) Signed-off-by: Guanghao Zhang <zghao@apache.org>
…putSplit (apache#1731) Signed-off-by: Guanghao Zhang <zghao@apache.org>
…putSplit (apache#1731) Signed-off-by: Guanghao Zhang <zghao@apache.org>
…putSplit (apache#1731) Signed-off-by: Guanghao Zhang <zghao@apache.org> (cherry picked from commit fff59f4) Change-Id: I48ee436d0daa357af0d3d324ae80e334db450ddd
In some scenario , We want to scan limited rows on each InputSplit for sampling data extraction