Skip to content

[SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR #30600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Dooyoung-Hwang
Copy link
Contributor

What changes were proposed in this pull request?

Currently, when a client requests FETCH_PRIOR to Thriftserver, Thriftserver reiterates from the start position. Because Thriftserver caches a query result with an array when THRIFTSERVER_INCREMENTAL_COLLECT feature is off, FETCH_PRIOR can be implemented without reiterating the result. A trait FeatureIterator is added in order to separate the implementation for iterator and an array. Also, FeatureIterator supports moves cursor with absolute position, which will be useful for the implementation of FETCH_RELATIVE, FETCH_ABSOLUTE.

Why are the changes needed?

For better performance of Thriftserver.

Does this PR introduce any user-facing change?

No

How was this patch tested?

FetchIteratorSuite

@Dooyoung-Hwang
Copy link
Contributor Author

@juliuszsompolski
I created PR.

cc : @HyukjinKwon , @dongjoon-hyun , @cloud-fan

@HyukjinKwon
Copy link
Member

ok to test

@HyukjinKwon
Copy link
Member

cc @wangyum as well

@github-actions github-actions bot added the SQL label Dec 4, 2020
@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Test build #132179 has finished for PR 30600 at commit b490f49.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36781/

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36781/

@Dooyoung-Hwang
Copy link
Contributor Author

Jenkins, retest this please

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Test build #132191 has finished for PR 30600 at commit b490f49.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36793/

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36793/

Dooyoung Hwang added 4 commits December 4, 2020 20:41
@Dooyoung-Hwang Dooyoung-Hwang force-pushed the refactor_with_fetch_iterator branch from b490f49 to 2b1b700 Compare December 4, 2020 11:42
@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Test build #132228 has finished for PR 30600 at commit 2b1b700.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36828/

@SparkQA
Copy link

SparkQA commented Dec 4, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36828/

Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Thanks @Dooyoung-Hwang

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me too. @wangyum can you do a final check when you find some time?

Comment on lines 145 to 147
if (order.equals(FetchOrientation.FETCH_FIRST)) iter.fetchAbsolute(0)
else if (order.equals(FetchOrientation.FETCH_PRIOR)) iter.fetchPrior(maxRowsL)
else iter.fetchNext()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Add curly braces for if-else
@SparkQA
Copy link

SparkQA commented Dec 7, 2020

Test build #132328 has finished for PR 30600 at commit fb93562.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member

wangyum commented Dec 7, 2020

retest this please.

@SparkQA
Copy link

SparkQA commented Dec 7, 2020

Test build #132368 has finished for PR 30600 at commit fb93562.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 7, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36968/

@SparkQA
Copy link

SparkQA commented Dec 7, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36968/

@HyukjinKwon
Copy link
Member

Merged to master.

@Dooyoung-Hwang
Copy link
Contributor Author

Thanks!!

yaooqinn pushed a commit to apache/kyuubi that referenced this pull request Feb 25, 2021
![pan3793](https://badgen.net/badge/Hello/pan3793/green) [![Closes #370](https://badgen.net/badge/Preview/Closes%20%23370/blue)](https://github.com/yaooqinn/kyuubi/pull/370) ![332](https://badgen.net/badge/%2B/332/red) ![24](https://badgen.net/badge/-/24/green) ![8](https://badgen.net/badge/commits/8/yellow) ![Feature](https://badgen.net/badge/Label/Feature/) ![Bug](https://badgen.net/badge/Label/Bug/) [&#10088;?&#10089;](https://pullrequestbadge.com/?utm_medium=github&utm_source=yaooqinn&utm_campaign=badge_info)<!-- PR-BADGE: PLEASE DO NOT REMOVE THIS COMMENT -->

<!--
Thanks for sending a pull request!

Here are some tips for you:
  1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html
  2. If the PR is related to an issue in https://github.com/yaooqinn/kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
  3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
-->

### _Why are the changes needed?_
<!--
Please clarify why the changes are needed. For instance,
  1. If you add a feature, you can talk about the use case of it.
  2. If you fix a bug, you can clarify why it is a bug.
-->
close #360

Ref: apache/spark#30600

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/latest/tools/testing.html#running-tests) locally before make a pull request

Closes #370 from pan3793/KYUUBI-360.

e79b8cb [Cheng Pan] [KYUUBI #360] comments
0fae3db [Cheng Pan] fix import
3d1b2a6 [Cheng Pan] [KYUUBI #360] fix ut
eda3e59 [Cheng Pan] [KYUUBI #360] fix import
16178d6 [Cheng Pan] [KYUUBI #360] ut
179404d [Cheng Pan] [KYUUBI #360] nit
455af6b [Cheng Pan] [KYUUBI #360] correct getNextRowSet with FETCH_PRIOR FETCH_FIRST
2307f1f [Cheng Pan] [KYUUBI #360] move ThriftUtils to kyuubi-common

Authored-by: Cheng Pan <379377944@qq.com>
Signed-off-by: Kent Yao <yao@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants