[SPARK-34399][SQL] Add commit duration to SQL tab's graph node. #31522
Conversation
Test build #135016 has finished for PR 31522 at commit
Kubernetes integration test starting
Kubernetes integration test status success
Test build #135066 has finished for PR 31522 at commit
Gentle ping @HeartSaVioR @dongjoon-hyun @HyukjinKwon @maropu @cloud-fan: could you help to review this? I think it's really helpful.
Kubernetes integration test starting
Kubernetes integration test status success
Test build #135289 has finished for PR 31522 at commit
The file commit is a driver-side thing; why do we need to update …
Since we compute DataWritingCommand's metrics on the driver side, and all metrics are stored in … If we want to directly update this value, we need to add it in spark/sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala (line 50 in 612d523).
Then pass the …
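To make the wiring concrete, here is a minimal Scala sketch of the approach described above: register a driver-side timing metric in DataWritingCommand's metrics map and hand the same map to the stats tracker. This is a hedged reconstruction, not the PR's actual diff; the trait name, the "jobCommitTime" key, and the metric descriptions are illustrative, and the `BasicWriteJobStatsTracker` constructor shape may differ across Spark versions.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.BasicWriteJobStatsTracker
import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
import org.apache.spark.util.SerializableConfiguration

trait DataWritingCommandSketch {
  // Driver-side metrics map; timing metrics render as durations in the SQL tab.
  lazy val metrics: Map[String, SQLMetric] = {
    val sparkContext = SparkSession.active.sparkContext
    Map(
      "numFiles" -> SQLMetrics.createMetric(sparkContext, "number of written files"),
      "jobCommitTime" -> SQLMetrics.createTimingMetric(sparkContext, "job commit time")
    )
  }

  // Hand the same map to the stats tracker, which fills in the commit
  // duration on the driver when the write job finishes.
  def basicWriteJobStatsTracker(hadoopConf: Configuration): BasicWriteJobStatsTracker = {
    val serializableHadoopConf = new SerializableConfiguration(hadoopConf)
    new BasicWriteJobStatsTracker(serializableHadoopConf, metrics)
  }
}
```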
Review threads on sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala (outdated; all resolved).
Kubernetes integration test starting
Kubernetes integration test status success
Test build #141422 has finished for PR 31522 at commit
Test build #141425 has finished for PR 31522 at commit
```diff
-      NUM_PARTS_KEY -> SQLMetrics.createMetric(sparkContext, "number of dynamic part")
+      NUM_PARTS_KEY -> SQLMetrics.createMetric(sparkContext, "number of dynamic part"),
+      TASK_COMMIT_TIME ->
+        SQLMetrics.createTimingMetric(sparkContext, "time of committing the tasks"),
```
Let's make it `task commit time` for short.
Done
```diff
       NUM_PARTS_KEY -> SQLMetrics.createMetric(sparkContext, "number of dynamic part"),
+      TASK_COMMIT_TIME ->
+        SQLMetrics.createTimingMetric(sparkContext, "time of committing the tasks"),
+      JOB_COMMIT_TIME -> SQLMetrics.createTimingMetric(sparkContext, "time of committing the job")
```
Let's make it `job commit time` for short.
Done
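With the renames applied, the hunks above register `task commit time` and `job commit time` as timing metrics. As a hedged sketch of how such a driver-side timing metric is typically populated (the method and parameter names here are illustrative assumptions, not the PR's code), one measures wall-clock time around the committer call and stores milliseconds, since timing metrics display millisecond values as durations:

```scala
import java.util.concurrent.TimeUnit

import org.apache.hadoop.mapreduce.JobContext
import org.apache.spark.internal.io.FileCommitProtocol
import org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
import org.apache.spark.sql.execution.metric.SQLMetric

// Illustrative helper: commit the job on the driver and record how long it took.
def commitJobWithMetric(
    committer: FileCommitProtocol,
    job: JobContext,
    taskCommits: Seq[TaskCommitMessage],
    jobCommitTime: SQLMetric): Unit = {
  val start = System.nanoTime()
  committer.commitJob(job, taskCommits)
  // Timing metrics hold milliseconds; set() overwrites any previous value.
  jobCommitTime.set(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start))
}
```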
LGTM otherwise
@AngersZhuuuu Thanks. Please update the screenshot in the PR description as well.
Test build #141495 has finished for PR 31522 at commit
Done
retest this please
Test build #141501 has started for PR 31522 at commit
Kubernetes integration test starting
Kubernetes integration test status success
Retest this please
Kubernetes integration test starting
Kubernetes integration test status success
Test build #141517 has finished for PR 31522 at commit
thanks, merging to master/3.2 (as it replaces the log we added in 3.2: #33279)
oh it has conflicts. @AngersZhuuuu can you open a backport PR? thanks!
…b commit time

### What changes were proposed in this pull request?
This is a follow-up of #31522. It adds docs for the new metrics of task/job commit time.

### Why are the changes needed?
So that users can understand the metrics better and know that the new metrics are only for file table writes.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Built the docs and previewed them. (screenshot of the docs preview omitted)

Closes #33542 from gengliangwang/addDocForMetrics.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit c9a7ff3)
Since we have added the log about commit time, I think this is useful, and we can let the user see it directly in the SQL tab's UI. (screenshot omitted)

Makes the commit duration directly visible to the user: the file commit duration is shown in the SQL tab's SQL plan graph.

Manually tested.

Closes apache#31522 from AngersZhuuuu/SPARK-34399.

Lead-authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Co-authored-by: AngersZhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
```diff
@@ -788,6 +792,24 @@ class SQLMetricsSuite extends SharedSparkSession with SQLMetricsTestUtils
     }
   }

+  test("SPARK-34399: Add job commit duration metrics for DataWritingCommand") {
```
I just happened to see the flakiness of the PR here: https://github.com/apache/spark/runs/4378978842
```
sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedException: 0 was not greater than 0
  at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
  at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
  at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
  at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.$anonfun$new$87(SQLMetricsSuite.scala:810)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
  at org.apache.spark.sql.test.SQLTestUtilsBase.withTable(SQLTestUtils.scala:305)
  at org.apache.spark.sql.test.SQLTestUtilsBase.withTable$(SQLTestUtils.scala:303)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.withTable(SQLMetricsSuite.scala:44)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.$anonfun$new$86(SQLMetricsSuite.scala:800)
  at org.apache.spark.sql.catalyst.plans.SQLHelper.withSQLConf(SQLHelper.scala:54)
  at org.apache.spark.sql.catalyst.plans.SQLHelper.withSQLConf$(SQLHelper.scala:38)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.org$apache$spark$sql$test$SQLTestUtilsBase$$super$withSQLConf(SQLMetricsSuite.scala:44)
  at org.apache.spark.sql.test.SQLTestUtilsBase.withSQLConf(SQLTestUtils.scala:246)
  at org.apache.spark.sql.test.SQLTestUtilsBase.withSQLConf$(SQLTestUtils.scala:244)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.withSQLConf(SQLMetricsSuite.scala:44)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.$anonfun$new$85(SQLMetricsSuite.scala:800)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
  at org.apache.spark.sql.execution.adaptive.DisableAdaptiveExecutionSuite.$anonfun$test$5(AdaptiveTestUtils.scala:65)
  at org.apache.spark.sql.catalyst.plans.SQLHelper.withSQLConf(SQLHelper.scala:54)
  at org.apache.spark.sql.catalyst.plans.SQLHelper.withSQLConf$(SQLHelper.scala:38)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.org$apache$spark$sql$test$SQLTestUtilsBase$$super$withSQLConf(SQLMetricsSuite.scala:44)
  at org.apache.spark.sql.test.SQLTestUtilsBase.withSQLConf(SQLTestUtils.scala:246)
  at org.apache.spark.sql.test.SQLTestUtilsBase.withSQLConf$(SQLTestUtils.scala:244)
  at org.apache.spark.sql.execution.metric.SQLMetricsSuite.withSQLConf(SQLMetricsSuite.scala:44)
  at org.apache.spark.sql.execution.adaptive.DisableAdaptiveExecutionSuite.$anonfun$test$4(AdaptiveTestUtils.scala:65)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  at org.scalatest.Transformer.apply(Transformer.scala:20)
  at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
  at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:190)
  at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
  at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
  at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
  at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
  at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
  at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:62)
  at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
  at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
  at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:62)
  at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
```
Seems not super flaky though. I am noting it here in case other people see this failure more.
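For reference, here is a rough sketch of a test in the spirit of the hunk above; the assertion style matches the failure quoted here (`0 was not greater than 0`). The `collectWriteMetrics` helper is a hypothetical stand-in for however the suite actually resolves the write node's metric values.

```scala
// Hedged sketch, not the suite's actual code: run an INSERT and assert
// that the new commit-time metrics come back positive.
test("SPARK-34399: Add job commit duration metrics for DataWritingCommand") {
  withTable("t") {
    sql("CREATE TABLE t(id BIGINT) USING parquet")
    sql("INSERT INTO t SELECT id FROM range(10)")
    val metrics = collectWriteMetrics("t") // hypothetical metric-lookup helper
    // A commit that takes under a millisecond could legitimately record 0,
    // which would explain the occasional flake noted above.
    assert(metrics("task commit time") > 0)
    assert(metrics("job commit time") > 0)
  }
}
```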
What changes were proposed in this pull request?
Since we have added a log about commit time, I think this is useful, and we can let users see it directly in the SQL tab's UI.

Why are the changes needed?
Makes the commit duration directly visible to the user.

Does this PR introduce any user-facing change?
Yes: users can see the file commit duration in the SQL tab's SQL plan graph.

How was this patch tested?
Manually tested.