Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #6108] Display the CPU time consumed by the statement in the Spark Engine tab #6113

Closed

Conversation

XorSum
Copy link
Contributor

@XorSum XorSum commented Feb 28, 2024

🔍 Description

Issue References 🔗

This pull request fixes #6108

Describe Your Solution 🔧

Follow #6112

Display the run time and CPU time of statements or sessions in the Spark UI.

screenshot 2024-03-05 10 37 26

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

Behavior With This Pull Request 🎉

Related Unit Tests


Checklist 📝

Be nice. Be informative.

@pan3793 pan3793 changed the title [KYUUBI #6108][FEATURE][SPARK] Display the CPU time consumed by the statement in the Engine tab [KYUUBI #6108] Display the CPU time consumed by the statement in the Spark Engine tab Feb 28, 2024
@codecov-commenter
Copy link

codecov-commenter commented Feb 29, 2024

Codecov Report

Attention: Patch coverage is 76.47059% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 61.14%. Comparing base (3bc28fd) to head (aad2bbb).
Report is 3 commits behind head on master.

Files Patch % Lines
...rc/main/scala/org/apache/spark/ui/EnginePage.scala 66.66% 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #6113      +/-   ##
============================================
+ Coverage     61.11%   61.14%   +0.03%     
  Complexity       23       23              
============================================
  Files           624      624              
  Lines         37245    37260      +15     
  Branches       5041     5046       +5     
============================================
+ Hits          22762    22784      +22     
+ Misses        12022    12015       -7     
  Partials       2461     2461              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cxzl25 cxzl25 added this to the v1.9.0 milestone Feb 29, 2024
@wForget
Copy link
Member

wForget commented Mar 1, 2024

@XorSum This PR seems to have many duplicates with #6112. Which one should we use?

@XorSum
Copy link
Contributor Author

XorSum commented Mar 1, 2024

@XorSum This PR seems to have many duplicates with #6112. Which one should we use?

Both.
#6112 collects the CPU time of operations, and #6113 displays the CPU time in Spark UI.
#6113 should rebase onto #6112.

@wForget
Copy link
Member

wForget commented Mar 1, 2024

@XorSum This PR seems to have many duplicates with #6112. Which one should we use?

Both. #6112 collects the CPU time of operations, and #6113 displays the CPU time in Spark UI. #6113 should rebase onto #6112.

Got it, you may need to rebase this change after #6112 is merged

@XorSum XorSum force-pushed the features/spark-engine-cpu-time-UI branch from 9b2df84 to 9018a07 Compare March 4, 2024 03:50
@XorSum
Copy link
Contributor Author

XorSum commented Mar 4, 2024

Got it, you may need to rebase this change after #6112 is merged

rebased

@wForget
Copy link
Member

wForget commented Mar 4, 2024

Should we show sessionRunTime instead of sessionCpuTime?

  • sessionRunTime: The total running time of all tasks
  • sessionCpuTime: The total cpu time occupied by all task threads (exclude io and other asynchronous operations?)

cc @pan3793 @yaooqinn @cxzl25

@cxzl25
Copy link
Contributor

cxzl25 commented Mar 4, 2024

Should we show sessionRunTime instead of sessionCpuTime?

Maybe we can display both values, they are two dimensions of data.

@wForget
Copy link
Member

wForget commented Mar 4, 2024

Maybe we can display both values, they are two dimensions of data.

SGTM, @XorSum what do you think?

@XorSum
Copy link
Contributor Author

XorSum commented Mar 4, 2024

Maybe we can display both values, they are two dimensions of data.

SGTM, @XorSum what do you think?

SGTM

@cxzl25
Copy link
Contributor

cxzl25 commented Mar 5, 2024

Thank you for your contribution.
Tested locally, it looks good, but we need to update the run time of the unclosed session immediately.

image

We can just merge the two methods into one by the way.

org.apache.kyuubi.engine.spark.session.SparkSessionImpl#increaseRunTime
org.apache.kyuubi.engine.spark.session.SparkSessionImpl#increaseCpuTime

@XorSum
Copy link
Contributor Author

XorSum commented Mar 5, 2024

@cxzl25 done.
截屏2024-03-05 13 58 02

@cxzl25 cxzl25 closed this in 49456bb Mar 5, 2024
@cxzl25
Copy link
Contributor

cxzl25 commented Mar 5, 2024

Thanks, merged to master.

zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Mar 21, 2024
…n the Spark Engine tab

# 🔍 Description
## Issue References 🔗

This pull request fixes apache#6108

## Describe Your Solution 🔧

Follow apache#6112

Display the run time and CPU time of statements or sessions in the Spark UI.

<img width="1429" alt="screenshot 2024-03-05 10 37 26" src="https://github.com/apache/kyuubi/assets/23011702/337772e0-a681-4989-b6f9-ee3633bb6287">

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes apache#6113 from XorSum/features/spark-engine-cpu-time-UI.

Closes apache#6108

aad2bbb [bkhan] session immediately
8e957c7 [bkhan] display run time
9018a07 [bkhan] Apply suggestions from code review
8523364 [bkhan] Display the CPU time consumed by the statement in the Engine tab

Authored-by: bkhan <bkhan@trip.com>
Signed-off-by: Shaoyun Chen <csy@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE][SPARK] Display the CPU time consumed by the statement in the Engine tab
5 participants