[KYUUBI #6661] Improve perf for column-based TRowSet generation #6662

hh-cn · 2024-09-03T06:43:23Z

🔍 Description

Issue References 🔗

This pull request fixes #6661

Describe Your Solution 🔧

TColumnGenerator.getColumnToList should not access a non-IndexedSeq with Seq.apply(i): on a linear sequence such as List, each indexed lookup walks the sequence from its head, which degrades performance. Converting the index-based loop to a foreach loop avoids this. See https://issues.apache.org/jira/browse/SPARK-47085 for more details.
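To illustrate the access pattern described above, here is a minimal sketch. It is not the actual Kyuubi patch: `ColumnAccessSketch`, `Row`, `columnByIndex`, and `columnByForeach` are hypothetical names chosen for this example, and rows are modeled as plain `Seq[Any]` values rather than the engine's row type. It only shows why indexed access on a `List` costs more than a single traversal.

```scala
import scala.collection.mutable.ArrayBuffer

object ColumnAccessSketch {
  // Hypothetical stand-in for a result row: a plain sequence of column values.
  type Row = Seq[Any]

  // Index-based loop. When `rows` is a List, rows(i) walks i nodes from the
  // head, so extracting one column costs O(n^2) overall.
  def columnByIndex(rows: Seq[Row], ordinal: Int): Seq[Any] = {
    val n = rows.length
    val out = new ArrayBuffer[Any](n)
    var i = 0
    while (i < n) {
      out += rows(i)(ordinal)
      i += 1
    }
    out.toSeq
  }

  // foreach-based loop. A single traversal, O(n) for any Seq implementation.
  def columnByForeach(rows: Seq[Row], ordinal: Int): Seq[Any] = {
    val out = new ArrayBuffer[Any](rows.length)
    rows.foreach(row => out += row(ordinal))
    out.toSeq
  }
}
```

Both functions return the same values; they differ only in how they walk `rows`. With an `IndexedSeq` such as `Vector` or an array-backed sequence the two perform similarly, so the difference only matters when the caller passes a linear sequence.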

Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

Behavior With This Pull Request 🎉

Related Unit Tests

Checklist 📝

- [ ] This patch was not authored or co-authored using Generative Tooling

Be nice. Be informative.

pan3793 · 2024-09-03T11:37:47Z

Do you have any statistics to measure the performance improvements? And have you compared the patched version with Spark Thrift Server?

pan3793 · 2024-09-03T11:48:25Z

I see the Spark ticket you attached on the issue, and understand your change now.

The code change LGTM. Please fill in the PR description properly; it is very important for helping future explorers understand each patch.

codecov-commenter · 2024-09-03T12:11:38Z

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (9533c5a) to head (4597e88).
Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
...apache/kyuubi/engine/result/TColumnGenerator.scala	0.00%	1 Missing ⚠️

Additional details and impacted files

@@          Coverage Diff           @@
##           master   #6662   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         683     683           
  Lines       42205   42204    -1     
  Branches     5756    5755    -1     
======================================
+ Misses      42205   42204    -1


bowenliang123 · 2024-09-03T14:41:40Z

BTW, are there any benchmark results, taken under the same conditions, that compare the two looping approaches and support the proposed change?

hh-cn · 2024-09-04T10:09:07Z

No benchmark yet
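For readers who want a rough comparison of the two looping approaches, the following is a naive timing sketch, not a result from this PR. `LoopTimingSketch` and its methods are hypothetical names, the row count of 20000 is an arbitrary assumption, and the harness does no JIT warm-up, so it is no substitute for a proper JMH benchmark.

```scala
object LoopTimingSketch {
  // Indexed access: each rows(i) walks i nodes when rows is a List.
  def sumByIndex(rows: Seq[Int]): Long = {
    val n = rows.length
    var total = 0L
    var i = 0
    while (i < n) {
      total += rows(i)
      i += 1
    }
    total
  }

  // Single traversal, independent of the Seq implementation.
  def sumByForeach(rows: Seq[Int]): Long = {
    var total = 0L
    rows.foreach(v => total += v)
    total
  }

  // Runs `body` once and returns (result, elapsed nanoseconds).
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, System.nanoTime() - start)
  }

  def main(args: Array[String]): Unit = {
    // A List is a linear (non-indexed) sequence; the size is an arbitrary choice.
    val rows: Seq[Int] = List.range(0, 20000)
    val (a, indexNs) = timed(sumByIndex(rows))
    val (b, foreachNs) = timed(sumByForeach(rows))
    println(s"index: ${indexNs / 1000} us, foreach: ${foreachNs / 1000} us, same result: ${a == b}")
  }
}
```

Swapping `List.range` for `Vector.range` in `main` gives a quick check of the indexed case, where the gap between the two loops is expected to be much smaller.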

hh-cn · 2024-09-04T10:13:38Z

@pan3793 the description has been updated; my apologies for missing that. Can this be merged and closed now?

# 🔍 Description

## Issue References 🔗

This pull request fixes #6661

## Describe Your Solution 🔧

TColumnGenerator.getColumnToList should not access to non-IndexedSeq with Seq.apply(i), which will cause performance reduce, convert it to foreach loop will be good. see https://issues.apache.org/jira/browse/SPARK-47085 for more details.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6662 from hh-cn/KYUUBI-6661.

Closes #6661

4597e88 [hang.huang] improve column-based TRowSet generation

Authored-by: hang.huang <hang.huang@advancegroup.com>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
(cherry picked from commit 14e07ea)
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>

bowenliang123 · 2024-09-04T14:49:38Z

Thanks, merged to master (1.10.0) and branch-1.9 (1.9.3).

improve column-based TRowSet generation

4597e88

github-actions bot added the module:common label Sep 3, 2024

pan3793 changed the title ~~[KYUUBI #6661] improve column-based TRowSet generation~~ [KYUUBI #6661] Improve perf for column-based TRowSet generation Sep 3, 2024

pan3793 approved these changes Sep 3, 2024

bowenliang123 approved these changes Sep 3, 2024

cxzl25 approved these changes Sep 4, 2024

bowenliang123 assigned hh-cn Sep 4, 2024

bowenliang123 added this to the v1.9.3 milestone Sep 4, 2024

bowenliang123 closed this in 14e07ea Sep 4, 2024

hh-cn deleted the KYUUBI-6661 branch September 5, 2024 02:47
