@shfshihuafeng shfshihuafeng commented Jul 11, 2025

DRILL-8528: HBase Limit Push Down

Description

Support limit push-down for HBase. I tested with the following query:

select * from clicks limit 3

The log shows that the HBase storage layer (HBaseRecordReader) fetched only 3 rows:

2025-07-11 01:48:17,297 [178f302e-26db-37dd-500d-db2797b17a5c:frag:0:0] INFO o.a.d.e.s.hbase.HBaseRecordReader - Took 7 ms to get 3 records
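
To illustrate what this log line reflects, here is a minimal, self-contained sketch of a reader that stops pulling rows once a pushed-down limit is reached. The names `LimitedReaderSketch` and `readRows`, the in-memory iterator standing in for an HBase scanner, and the convention that a non-positive `maxRecords` means "no limit" are all assumptions made for illustration; this is not the actual `HBaseRecordReader` code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class LimitedReaderSketch {

  /**
   * Reads rows from the source until it is exhausted or maxRecords rows
   * have been collected. In this sketch a non-positive maxRecords means
   * "no limit" (an assumption, not taken from the PR).
   */
  public static <T> List<T> readRows(Iterator<T> source, int maxRecords) {
    List<T> rows = new ArrayList<>();
    // Check the limit first so no extra row is requested from the source.
    while ((maxRecords <= 0 || rows.size() < maxRecords) && source.hasNext()) {
      rows.add(source.next());
    }
    return rows;
  }

  public static void main(String[] args) {
    // Hypothetical five-row table standing in for an HBase scan result.
    List<String> table = List.of("r1", "r2", "r3", "r4", "r5");
    System.out.println(readRows(table.iterator(), 3)); // prints [r1, r2, r3]
    System.out.println(readRows(table.iterator(), 0).size()); // prints 5
  }
}
```

Checking the limit before calling `hasNext()` matters in a sketch like this because, for a remote scanner, asking for the next row can itself trigger a fetch.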

The plan is as follows:

00-00    Screen : rowType = RecordType(ANY row_key, (VARCHAR(65535), ANY) MAP clickinfo, (VARCHAR(65535), ANY) MAP iteminfo): rowcount = 3.0, cumulative cost = {18.3 rows, 51.3 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 301
00-01      Project(row_key=[$0], clickinfo=[$1], iteminfo=[$2]) : rowType = RecordType(ANY row_key, (VARCHAR(65535), ANY) MAP clickinfo, (VARCHAR(65535), ANY) MAP iteminfo): rowcount = 3.0, cumulative cost = {18.0 rows, 51.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 300
00-02        SelectionVectorRemover : rowType = RecordType(ANY row_key, (VARCHAR(65535), ANY) MAP clickinfo, (VARCHAR(65535), ANY) MAP iteminfo): rowcount = 3.0, cumulative cost = {15.0 rows, 42.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 297
00-03          Limit(fetch=[3]) : rowType = RecordType(ANY row_key, (VARCHAR(65535), ANY) MAP clickinfo, (VARCHAR(65535), ANY) MAP iteminfo): rowcount = 3.0, cumulative cost = {12.0 rows, 39.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 296
00-04            Scan(table=[[hbase, clicks]], groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName="clicks"], columns=[`row_key`, `clickinfo`, `iteminfo`], maxRecords=3]]) : rowType = RecordType(ANY row_key, (VARCHAR(65535), ANY) MAP clickinfo, (VARCHAR(65535), ANY) MAP iteminfo): rowcount = 9.0, cumulative cost = {9.0 rows, 27.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 295
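
In this plan the `Limit(fetch=[3])` operator is still present, while the scan now carries `maxRecords=3`. A minimal sketch of that idea, assuming an immutable scan descriptor that returns a copy of itself with the limit attached, might look like the following. `ScanSketch`, its `applyLimit` method, and the use of `-1` for "no limit" are hypothetical stand-ins for illustration, not the classes changed in this PR.

```java
import java.util.List;
import java.util.Optional;

public final class ScanSketch {
  private final String tableName;
  private final List<String> columns;
  private final int maxRecords; // -1 means no limit has been pushed down (assumption)

  public ScanSketch(String tableName, List<String> columns, int maxRecords) {
    this.tableName = tableName;
    this.columns = List.copyOf(columns);
    this.maxRecords = maxRecords;
  }

  /**
   * Returns a copy carrying the given limit, or empty when the limit is
   * invalid or would not tighten the one already recorded.
   */
  public Optional<ScanSketch> applyLimit(int limit) {
    if (limit < 0 || (maxRecords >= 0 && maxRecords <= limit)) {
      return Optional.empty(); // nothing new to push down
    }
    return Optional.of(new ScanSketch(tableName, columns, limit));
  }

  public int getMaxRecords() {
    return maxRecords;
  }

  @Override
  public String toString() {
    return "ScanSketch [tableName=" + tableName + ", columns=" + columns
        + ", maxRecords=" + maxRecords + "]";
  }

  public static void main(String[] args) {
    ScanSketch scan = new ScanSketch("clicks", List.of("row_key", "clickinfo", "iteminfo"), -1);
    // Pushing LIMIT 3 yields a new descriptor; the original is unchanged.
    scan.applyLimit(3).ifPresent(System.out::println);
    // Pushing a looser limit onto the limited scan changes nothing.
    System.out.println(scan.applyLimit(3).get().applyLimit(5).isPresent()); // prints false
  }
}
```

Returning an empty result when nothing changes is one way to let a calling planner rule know it has no new plan to propose; whether the real implementation signals this the same way is not shown in this conversation.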

Documentation

(Please describe user-visible changes similar to what should appear in the Drill documentation.)

Testing

Added test: org.apache.drill.hbase.TestHBaseFilterPushDown#testLimitPushDown

@shfshihuafeng shfshihuafeng force-pushed the hbase_limit_pushdown branch from 50da524 to b698849 Compare July 11, 2025 09:07
@cgivre cgivre self-requested a review July 11, 2025 12:28
@cgivre cgivre added enhancement PRs that add a new functionality to Drill performance PRs that Improve Performance labels Jul 11, 2025

@cgivre cgivre left a comment

@shfshihuafeng
Thanks for this. I have a few minor nits but other than that it looks good.

}

@JsonIgnore
public int getMaxRecords() {

Again, is there a reason for the @JsonIgnore?

@shfshihuafeng shfshihuafeng Jul 12, 2025

I don't think this parameter needs to appear in the JSON. Do you suggest removing the annotation?

@shfshihuafeng shfshihuafeng force-pushed the hbase_limit_pushdown branch from b698849 to 4641849 Compare July 12, 2025 02:00

@cgivre cgivre left a comment

It looks like I spoke too soon. The tests are failing for Java 17.

shfshihuafeng commented Jul 13, 2025

It looks like I spoke too soon. The tests are failing for Java 17.
@cgivre The test cases passed in my local environment on Java 17. This may be triggering a bug that is difficult to troubleshoot. I will try to reproduce the issue.

shfshihuafeng commented Jul 13, 2025

It looks like I spoke too soon. The tests are failing for Java 17.

@cgivre The error is not caused by the submitted code (HBase limit push-down). I added my unit test cases to the original code, which has no HBase limit push-down, and the plan shows that the limit was not pushed down. It still reports the same error, which occurs intermittently.

@shfshihuafeng
Contributor Author

It looks like I spoke too soon. The tests are failing for Java 17.

@cgivre Maybe it would be best to open a new JIRA for this error and add some unit tests for offset.

cgivre commented Jul 13, 2025

It looks like I spoke too soon. The tests are failing for Java 17.

@cgivre Maybe it would be best to open a new JIRA for this error and add some unit test for offset

I'm going to rerun the failing tests and see if that fixes it. Sometimes these tests are flaky and will randomly fail.

@shfshihuafeng shfshihuafeng force-pushed the hbase_limit_pushdown branch from e6ddeac to 77965e7 Compare July 20, 2025 13:49

@cgivre cgivre left a comment

LGTM +1. Thanks for submitting!

@cgivre cgivre merged commit 27f3359 into apache:master Jul 20, 2025
7 checks passed