[SPARK-45536][BUILD] Lower the default `-Xmx` of `build/mvn` to 3g #43364

LuciferYang · 2023-10-13T08:00:51Z

What changes were proposed in this pull request?

This PR lowers the default `-Xmx` of `build/mvn` from 4g to 3g to reduce the peak memory usage of Maven compilation.

Why are the changes needed?

This may fix the failing snapshot build: https://github.com/apache/spark/actions/runs/6502277099

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manual check.

run

build/mvn clean install -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Pspark-ganglia-lgpl -Phadoop-cloud

Before

Peak memory usage is at 6.1GB.

After

Peak memory usage is at 5GB, but the compilation time has increased by 10%.
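
To try the lower heap locally without editing the script, something like the following might work. It is a minimal sketch that assumes `build/mvn` only falls back to its built-in JVM options when `MAVEN_OPTS` is unset, and that those defaults look like the `DEFAULT_OPTS` string below; both are assumptions to verify against your checkout. `with_xmx` is a hypothetical helper, not part of Spark.

```shell
#!/bin/sh
# with_xmx OPTS SIZE: print OPTS with any existing -Xmx flag removed and
# -Xmx<SIZE> appended. Hypothetical helper, not part of build/mvn.
with_xmx() {
  opts=$1
  size=$2
  out=""
  for opt in $opts; do
    case "$opt" in
      -Xmx*) ;;                 # drop any existing max-heap flag
      *) out="$out $opt" ;;     # keep every other JVM option unchanged
    esac
  done
  out="$out -Xmx$size"
  echo "${out# }"               # strip the leading space
}

# Assumed defaults, for illustration only; check build/mvn for the real values.
DEFAULT_OPTS="-Xss128m -Xmx4g -XX:ReservedCodeCacheSize=128m"

# Example invocation (not executed here):
#   MAVEN_OPTS="$(with_xmx "$DEFAULT_OPTS" 3g)" build/mvn clean install -DskipTests
```

Setting the variable per invocation keeps the before/after comparison reproducible without touching the checked-in script.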

Was this patch authored or co-authored using generative AI tooling?

No

LuciferYang · 2023-10-13T08:07:54Z

@HyukjinKwon Is there a way to manually trigger Publish snapshot using this PR, or do we have to wait until it's merged to get validation?

HyukjinKwon · 2023-10-13T08:15:45Z

We can merge and try. I made another PR, #43365, for a different approach.

EnricoMi · 2023-10-13T08:58:10Z

> @HyukjinKwon Is there a way to manually trigger Publish snapshot using this PR, ...

You could, if you added this to the publish_snapshot.yml workflow:

on:
  workflow_dispatch

https://docs.github.com/en/actions/using-workflows/manually-running-a-workflow
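
Once such a trigger exists on the default branch (GitHub only offers manual runs for workflow files present there), the run could also be started from the GitHub CLI. The sketch below only prints the command rather than running it; `dispatch_cmd` is a hypothetical helper, and `master` as the target ref is an assumption.

```shell
#!/bin/sh
# dispatch_cmd WORKFLOW REF: print the gh command that would start WORKFLOW
# on REF. Hypothetical helper; it does not contact GitHub itself.
dispatch_cmd() {
  echo "gh workflow run $1 --ref $2"
}

# Print the command for the snapshot workflow on master (assumed ref).
dispatch_cmd publish_snapshot.yml master
```

Running the printed command from a clone of the repository needs an authenticated `gh` with write access.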

beliefer · 2023-10-13T09:13:40Z

@LuciferYang I don't understand "This can potentially fix the snapshot build being failed: https://github.com/apache/spark/actions/runs/6502277099". Could you explain why you reduced the -Xmx?

EnricoMi · 2023-10-13T09:16:48Z

The virtual machine building the releases (the host, not the JVM) has 7GB of memory. The build process uses 6.1GB. They suspect the build process is being killed because it uses too much memory.
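
The arithmetic behind that suspicion can be sketched as follows. The figures are the ones reported in this thread; treating 1GB as 1024MB and rounding 6.1GB to 6246MB are assumptions made only for the illustration, and `headroom_mb` is a hypothetical helper.

```shell
#!/bin/sh
# headroom_mb HOST_MB PEAK_MB: memory left on the host at peak usage, in MB.
headroom_mb() {
  echo $(( $1 - $2 ))
}

HOST_MB=7168          # 7GB host, taking 1GB = 1024MB (assumption)
PEAK_BEFORE_MB=6246   # ~6.1GB peak with -Xmx4g, from the PR description
PEAK_AFTER_MB=5120    # ~5GB peak with -Xmx3g, from the PR description

echo "headroom before: $(headroom_mb "$HOST_MB" "$PEAK_BEFORE_MB") MB"
echo "headroom after:  $(headroom_mb "$HOST_MB" "$PEAK_AFTER_MB") MB"
```

Under these assumptions the build leaves under 1GB for everything else on the host before the change and about 2GB after it.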

LuciferYang · 2023-10-13T09:17:10Z

> @LuciferYang I don't understand This can potentially fix the snapshot build being failed: https://github.com/apache/spark/actions/runs/6502277099. Could you give an explanation why you reduce the -Xmx?

It's just a guess. Based on past experience, the compilation container may be getting killed for using too much memory (Java 17 seems to use more metaspace during the Maven build), but I don't have concrete evidence in this case. Do you have any better suggestions? @beliefer

HyukjinKwon · 2023-10-13T09:54:42Z

Let's just try. If it doesn't work, we can revert.

LuciferYang · 2023-10-13T14:54:02Z

[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.sql.kafka010.KafkaSourceStressSuite
[error] (sql-kafka-0-10 / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 2132 s (35:32), completed Oct 13, 2023, 12:07:56 PM

Only the KafkaSourceStressSuite test failed; it is a known flaky test.

LuciferYang · 2023-10-13T14:57:18Z

Merging into master to observe the Publish Snapshot job. If it doesn't work, we can revert it tomorrow.

Thanks @HyukjinKwon @beliefer @EnricoMi

dongjoon-hyun · 2023-10-13T16:46:20Z

Thank you, @LuciferYang and all.

Since the Java 17 JVM's GC differs from the old ParallelGC, we can optimize further.

LuciferYang · 2023-10-14T02:48:42Z

@dongjoon-hyun

https://github.com/apache/spark/actions/runs/6514229181/job/17696846279

It still doesn't seem to work. Do you have any ideas or suggestions for optimizing the compilation options?

LuciferYang · 2023-10-14T11:58:57Z

I tried running mvn deploy against a local Nexus, and no failures occurred...

…to 3g"

This reverts commit 3e2470d.

### What changes were proposed in this pull request?
This PR reverts the change made in #43364.

### Why are the changes needed?
It seems to have had no effect on fixing `Publish snapshot`, which still failed:
- https://github.com/apache/spark/actions/runs/6514229181/job/17696846279

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43372 from LuciferYang/revert-SPARK-45536.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>

EnricoMi · 2023-10-24T16:19:22Z

Another attempt to fix this in #43512 / SPARK-45651.

### What changes were proposed in this pull request?
With a manual trigger, the workflow can be executed manually after merging a fix of the workflow to master. This also allows running the workflow on only a subset of branches (e.g. those that failed).

### Why are the changes needed?
Sometimes, publishing snapshots fails. If a fix of the workflow file is needed, that change can only be tested by waiting until the next day, when the cron event triggers the next publish. This is quite a long turnaround for testing fixes to that workflow (see #43364).

### Does this PR introduce _any_ user-facing change?
No, this is purely build CI related.

### How was this patch tested?
This can only be tested in master. GitHub workflow syntax tested in a private repo.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43512 from EnricoMi/publish-snapshot-manually.

Authored-by: Enrico Minack <github@enrico.minack.dev>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

init

7d1fc36

github-actions bot added the BUILD label Oct 13, 2023

LuciferYang changed the title ~~Lower the default -Xmx of build/mvn to 3g.~~ Lower the default -Xmx of build/mvn to 3g Oct 13, 2023

LuciferYang changed the title ~~Lower the default -Xmx of build/mvn to 3g~~ [BUILD] Lower the default -Xmx of build/mvn to 3g Oct 13, 2023

LuciferYang changed the title ~~[BUILD] Lower the default -Xmx of build/mvn to 3g~~ [SPARK-45536][BUILD] Lower the default -Xmx of build/mvn to 3g Oct 13, 2023

HyukjinKwon approved these changes Oct 13, 2023

LuciferYang closed this in 3e2470d Oct 13, 2023

LuciferYang mentioned this pull request Oct 14, 2023

Revert "[SPARK-45536][BUILD] Lower the default -Xmx of build/mvn to 3g" #43372

Closed

LuciferYang deleted the r-xmx-3g branch October 18, 2023 05:23

EnricoMi mentioned this pull request Oct 24, 2023

[SPARK-45651][BUILD] Publish snapshot manually #43512

Closed

EnricoMi mentioned this pull request Oct 26, 2023

[SPARK-45651][Build] Log memory usage of publish snapshot workflow #43513

Closed

5 participants
