Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid array copy in generating process builder commands #5765

Closed
wants to merge 4 commits into from

Conversation

bowenliang123
Copy link
Contributor

🔍 Description

Issue References 🔗

As described.

Describe Your Solution 🔧

Currently, each process builder use ArrayBuffer.toArray for generating process builder commands which includes unnecessary array copy. Reuse the buffer and use Iteratable for commands.

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

No behavior change.

Behavior With This Pull Request 🎉

No behavior change.

Related Unit Tests

No behavior change.


Checklists

📝 Author Self Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • This patch was not authored or co-authored using Generative Tooling

📝 Committer Pre-Merge Checklist

  • Pull request title is okay.
  • No license issues.
  • Milestone correctly set?
  • Test coverage is ok
  • Assignees are selected.
  • Minimum number of approvals
  • No changes are requested

Be nice. Be informative.

@bowenliang123 bowenliang123 changed the title Prevent arrary copy for generating process builder commands [MINOR] Prevent array copy for generating process builder commands Nov 24, 2023
@bowenliang123 bowenliang123 changed the title [MINOR] Prevent array copy for generating process builder commands [MINOR] Prevent array copy in generating process builder commands Nov 24, 2023
@bowenliang123 bowenliang123 marked this pull request as draft November 24, 2023 05:02
@bowenliang123 bowenliang123 changed the title [MINOR] Prevent array copy in generating process builder commands [Improvement] Avoid array copy in generating process builder commands Nov 24, 2023
@bowenliang123 bowenliang123 marked this pull request as ready for review November 24, 2023 05:39
@@ -142,7 +142,7 @@ trait ProcBuilder {
}

final lazy val processBuilder: ProcessBuilder = {
val pb = new ProcessBuilder(commands: _*)
val pb = new ProcessBuilder(commands.toStream.asJava)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this introduce the overhead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toStream uses a iterator, which prevents array copy to a list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it requires to be converted to Array eventually, so early transformation should be good too. would prefer to keep as-is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I got your point. Closing this PR.

@pan3793
Copy link
Member

pan3793 commented Nov 24, 2023

A general principle: for performance in-sensitive code path, readability is the most important thing

@codecov-commenter
Copy link

codecov-commenter commented Nov 24, 2023

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (4d48796) 61.39% compared to head (8ece849) 61.37%.
Report is 4 commits behind head on master.

Files Patch % Lines
...apache/kyuubi/engine/chat/ChatProcessBuilder.scala 0.00% 2 Missing ⚠️
...kyuubi/engine/spark/SparkBatchProcessBuilder.scala 50.00% 0 Missing and 1 partial ⚠️
...ache/kyuubi/engine/spark/SparkProcessBuilder.scala 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5765      +/-   ##
============================================
- Coverage     61.39%   61.37%   -0.03%     
  Complexity       23       23              
============================================
  Files           607      607              
  Lines         35905    35920      +15     
  Branches       4925     4927       +2     
============================================
+ Hits          22043    22045       +2     
- Misses        11478    11491      +13     
  Partials       2384     2384              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bowenliang123 bowenliang123 deleted the commands-iter branch November 24, 2023 06:44
@bowenliang123 bowenliang123 restored the commands-iter branch November 24, 2023 07:10
@bowenliang123 bowenliang123 reopened this Nov 24, 2023
@bowenliang123
Copy link
Contributor Author

bowenliang123 commented Nov 24, 2023

Reopened this PR after offline discussions.

Previously: ArrayBuffer - copyTo Array-> _* -> copyTo ArrayList (when constructing ProcessBuilder)
With PR: ArrayBuffer - asStream-> toIterator -> construct with List (when constructing ProcessBuilder)

This PR helps to prevent first stage ArrayBuffer - copyTo Array array copying.

@pan3793 pan3793 changed the title [Improvement] Avoid array copy in generating process builder commands Avoid array copy in generating process builder commands Nov 27, 2023
@bowenliang123 bowenliang123 self-assigned this Nov 28, 2023
@bowenliang123 bowenliang123 added this to the v1.9.0 milestone Nov 28, 2023
@bowenliang123
Copy link
Contributor Author

Thanks, merged to master.

@bowenliang123 bowenliang123 deleted the commands-iter branch November 28, 2023 03:14
bowenliang123 added a commit that referenced this pull request Dec 1, 2023
# 🔍 Description
## Issue References 🔗

As described.

## Describe Your Solution 🔧

Currently, each process builder use `ArrayBuffer.toArray` for generating process builder commands which includes unnecessary array copy. Reuse the buffer and use `Iteratable` for commands.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
No behavior change.

#### Behavior With This Pull Request 🎉
No behavior change.

#### Related Unit Tests
No behavior change.

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes #5765 from bowenliang123/commands-iter.

Closes #5765

8ece849 [Bowen Liang] update
0702a6a [Bowen Liang] style
e390809 [Bowen Liang] fix processBuilder
d2d0fe2 [Bowen Liang] use Iterable as data type of process builder's commands

Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
(cherry picked from commit f3a621c)
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
@bowenliang123 bowenliang123 modified the milestones: v1.9.0, v1.8.1 Dec 1, 2023
pan3793 added a commit that referenced this pull request Feb 22, 2024
…se notes

# 🔍 Description
## Issue References 🔗

Currently, we use a rather primitive way to manually write release notes from scratch, and some of the mechanical and repetitive work can be simplified by the scripts.

## Describe Your Solution 🔧

Adds a script to simplify the process of creating release notes.

Note: it just simplifies some processes, the release manager still needs to tune the outputs by hand.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

```
RELEASE_TAG=v1.8.1 PREVIOUS_RELEASE_TAG=v1.8.0 build/release/pre_gen_release_notes.py
```

```
$ head build/release/commits-v1.8.1.txt
[KYUUBI #5981] Deploy Spark Hive connector with Scala 2.13 to Maven Central
[KYUUBI #6058] Make Jetty server stop timeout configurable
[KYUUBI #5952][1.8] Disconnect connections without running operations after engine maxlife time graceful period
[KYUUBI #6048] Assign serviceNode and add volatile for variables
[KYUUBI #5991] Error on reading Atlas properties composed of multi values
[KYUUBI #6045] [REST] Sync the AdminRestApi with the AdminResource Apis
[KYUUBI #6047] [CI] Free up disk space
[KYUUBI #6036] JDBC driver conditional sets fetchSize on opening session
[KYUUBI #6028] Exited spark-submit process should not block batch submit queue
[KYUUBI #6018] Speed up GetTables operation for Spark session catalog
```

```
$ head build/release/contributors-v1.8.1.txt
* Shaoyun Chen        -- [KYUUBI #5857][KYUUBI #5720][KYUUBI #5785][KYUUBI #5617]
* Chao Chen           -- [KYUUBI #5750]
* Flyangz             -- [KYUUBI #5832]
* Pengqi Li           -- [KYUUBI #5713]
* Bowen Liang         -- [KYUUBI #5730][KYUUBI #5802][KYUUBI #5767][KYUUBI #5831][KYUUBI #5801][KYUUBI #5754][KYUUBI #5626][KYUUBI #5811][KYUUBI #5853][KYUUBI #5765]
* Paul Lin            -- [KYUUBI #5799][KYUUBI #5814]
* Senmiao Liu         -- [KYUUBI #5969][KYUUBI #5244]
* Xiao Liu            -- [KYUUBI #5962]
* Peiyue Liu          -- [KYUUBI #5331]
* Junjie Ma           -- [KYUUBI #5789]
```
---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6074 from pan3793/release-script.

Closes #6074

3d5ec20 [Cheng Pan] credits
1765279 [Cheng Pan] Add a script to simplify the process of creating release notes

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Mar 21, 2024
… release notes

# 🔍 Description
## Issue References 🔗

Currently, we use a rather primitive way to manually write release notes from scratch, and some of the mechanical and repetitive work can be simplified by the scripts.

## Describe Your Solution 🔧

Adds a script to simplify the process of creating release notes.

Note: it just simplifies some processes, the release manager still needs to tune the outputs by hand.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

```
RELEASE_TAG=v1.8.1 PREVIOUS_RELEASE_TAG=v1.8.0 build/release/pre_gen_release_notes.py
```

```
$ head build/release/commits-v1.8.1.txt
[KYUUBI apache#5981] Deploy Spark Hive connector with Scala 2.13 to Maven Central
[KYUUBI apache#6058] Make Jetty server stop timeout configurable
[KYUUBI apache#5952][1.8] Disconnect connections without running operations after engine maxlife time graceful period
[KYUUBI apache#6048] Assign serviceNode and add volatile for variables
[KYUUBI apache#5991] Error on reading Atlas properties composed of multi values
[KYUUBI apache#6045] [REST] Sync the AdminRestApi with the AdminResource Apis
[KYUUBI apache#6047] [CI] Free up disk space
[KYUUBI apache#6036] JDBC driver conditional sets fetchSize on opening session
[KYUUBI apache#6028] Exited spark-submit process should not block batch submit queue
[KYUUBI apache#6018] Speed up GetTables operation for Spark session catalog
```

```
$ head build/release/contributors-v1.8.1.txt
* Shaoyun Chen        -- [KYUUBI apache#5857][KYUUBI apache#5720][KYUUBI apache#5785][KYUUBI apache#5617]
* Chao Chen           -- [KYUUBI apache#5750]
* Flyangz             -- [KYUUBI apache#5832]
* Pengqi Li           -- [KYUUBI apache#5713]
* Bowen Liang         -- [KYUUBI apache#5730][KYUUBI apache#5802][KYUUBI apache#5767][KYUUBI apache#5831][KYUUBI apache#5801][KYUUBI apache#5754][KYUUBI apache#5626][KYUUBI apache#5811][KYUUBI apache#5853][KYUUBI apache#5765]
* Paul Lin            -- [KYUUBI apache#5799][KYUUBI apache#5814]
* Senmiao Liu         -- [KYUUBI apache#5969][KYUUBI apache#5244]
* Xiao Liu            -- [KYUUBI apache#5962]
* Peiyue Liu          -- [KYUUBI apache#5331]
* Junjie Ma           -- [KYUUBI apache#5789]
```
---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes apache#6074 from pan3793/release-script.

Closes apache#6074

3d5ec20 [Cheng Pan] credits
1765279 [Cheng Pan] Add a script to simplify the process of creating release notes

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants