[SPARK-27900][K8s] Add jvm oom flag #25229
Conversation
@@ -65,6 +65,7 @@ case "$1" in
  shift 1
  CMD=(
    "$SPARK_HOME/bin/spark-submit"
    --driver-java-options '-XX:OnOutOfMemoryError="kill -9 %p"'
We are in client mode, so it's best to use this arg. Also, this is going to be parsed by the SparkSubmitCommandBuilder later on.
Makes sense - the parent process (I think it's bin/spark-class) will eventually route this as a JVM argument to the child driver process.
Shouldn't we be adding this for executors also?
@mccheah AFAIK executors have a handler for uncaught exceptions, and they will handle this and exit:
uncaughtExceptionHandler: UncaughtExceptionHandler = new SparkUncaughtExceptionHandler)
Thread.setDefaultUncaughtExceptionHandler(uncaughtExceptionHandler)
Test build #108014 has finished for PR 25229 at commit
Do we need this? It seems that we can use spark.driver.extraJavaOptions already.
We need to make sure it's added by default and not rely on the user to pass it. If you mean whether to pass this as a Spark conf using that property: in client mode this is the way to do it since, according to the docs, the driver has already started. I dug into the code and verified that this is being properly picked up; I'm not sure about extraJavaOptions, the docs say not to use it. So this is a working option.
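For context, a minimal sketch of the two submission-side ways to set driver JVM options (command shapes and paths are illustrative, not taken from this PR):

# Client mode: the driver JVM is started by spark-submit itself, so the dedicated
# CLI flag (or spark-defaults.conf) is the documented way to pass JVM options:
"$SPARK_HOME/bin/spark-submit" \
  --driver-java-options '-XX:OnOutOfMemoryError="kill -9 %p"' \
  --class org.apache.spark.examples.SparkPi \
  local:///path/to/spark-examples.jar

# Cluster mode: the conf property works because the driver is launched remotely
# after the conf has been read:
"$SPARK_HOME/bin/spark-submit" --deploy-mode cluster \
  --conf 'spark.driver.extraJavaOptions=-XX:OnOutOfMemoryError="kill -9 %p"' \
  --class org.apache.spark.examples.SparkPi \
  local:///path/to/spark-examples.jar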
@skonto. I really appreciate your active contributions in this area. If this is really needed, yes, we should have this.
However, we have an integration test for "Run extraJVMOptions check on driver" and this PR already breaks it. That's the reason why I think this PR is not safe enough.
> So this is a working option.

As we see in the integration test failure, sometimes "by default" isn't the best option. I worry about accidentally breaking existing user workloads. We should not surprise the users.
Could you improve this PR in three directions?
1. Pass the existing integration test by respecting spark.driver.extraJavaOptions. Please don't modify the existing test case.
2. Provide a way to override this PR's suggestion (in other words, users should be able to override the OnOutOfMemoryError option, removing it or adding another behavior).
3. Add new test coverage for (2). This will make your commit safe.
@dongjoon-hyun Ok, I will improve the PR. I had in mind to test
Thank you so much, @skonto!
@dongjoon-hyun I updated the PR.
I get the flag added by default (tested with
If I add
Right now this is a somewhat verbose solution, but complete. The other alternatives I see are:
Thank you for updating, @skonto. I've been waiting for that. Could you update the PR description, too?
echo "$@" | ||
else | ||
if grep -q "spark.driver.extraJavaOptions" "/opt/spark/conf/spark.properties"; then | ||
sed 's/spark.driver.extraJavaOptions=/&-XX:OnOutOfMemoryError="kill -9 %p" /g' /opt/spark/conf/spark.properties > /tmp/spark.properties |
/opt/spark/conf/spark.properties cannot be written in place as it's a mounted config map.
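A minimal sketch of the resulting copy-then-edit approach (paths as in the PR; the in-place sed flag here is illustrative):

# /opt/spark/conf is mounted from a ConfigMap and is read-only, so edit a copy under /tmp instead.
cp /opt/spark/conf/spark.properties /tmp/spark.properties
sed -i 's/spark.driver.extraJavaOptions=/&-XX:OnOutOfMemoryError="kill -9 %p" /' /tmp/spark.properties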
case "$1" in | ||
driver) | ||
shift 1 | ||
DRIVER_ARGS=$(get_args_with_defaults "$@") | ||
VERBOSE_FLAG=$(get_verbose_flag) |
We use this flag so that in the tests we can detect what properties we get. It is also useful in general when users want to debug Spark.
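A hypothetical sketch of such a helper; the function body and the SPARK_SUBMIT_VERBOSE variable are assumptions, not the PR's actual code:

# Emit --verbose only when requested, so spark-submit prints the resolved
# properties and tests (or users) can inspect what the driver actually receives.
get_verbose_flag() {
  if [ "${SPARK_SUBMIT_VERBOSE:-false}" = "true" ]; then
    echo "--verbose"
  fi
}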
@@ -60,14 +60,44 @@ if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
  SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
fi

IGNORE_DEFAULT_DRIVER_JVM_OPTIONS=${IGNORE_DEFAULT_DRIVER_JVM_OPTIONS:-false}
Can we have DEFAULT_DRIVER_JVM_OPTIONS instead of this flag?
1. By default, DEFAULT_DRIVER_JVM_OPTIONS=-XX:OnOutOfMemoryError="kill -9 %p" and it will be appended before spark.driver.extraJavaOptions.
2. If users unset DEFAULT_DRIVER_JVM_OPTIONS, then only spark.driver.extraJavaOptions works.

At (1), if spark.driver.extraJavaOptions has -XX:OnOutOfMemoryError=, it will supersede DEFAULT_DRIVER_JVM_OPTIONS because the last one is used.
@dongjoon-hyun How can they unset DEFAULT_DRIVER_JVM_OPTIONS at runtime? I'm probably missing something here. Do you mean by setting the container env var, e.g. spark.kubernetes.driverEnv.DEFAULT_DRIVER_JVM_OPTIONS=""?
@dongjoon-hyun the only option I see is setting DEFAULT_DRIVER_JVM_OPTIONS to an empty string to signal the script not to add the default, which is similar to setting that flag to true. I will rename the flag, but I'm not sure we are aligned.

> If users unset DEFAULT_DRIVER_JVM_OPTIONS

How should the user do that? Could you elaborate a bit more?
@dongjoon-hyun gentle ping :)
Oops. Sorry for being late, @skonto. I'll take a look at this PR again tomorrow~
Initially, what I thought was something like the following:
DEFAULT_DRIVER_JVM_OPTIONS=${DEFAULT_DRIVER_JVM_OPTIONS:--XX:OnOutOfMemoryError="kill -9 %p"}
And users do export DEFAULT_DRIVER_JVM_OPTIONS=' ' to unset it. Ya, my comment was unclear at that time.
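A sketch of how a user might clear that default from a submission, assuming the variable is surfaced to the driver container via the existing spark.kubernetes.driverEnv.* mechanism (master URL, image, and jar path are placeholders):

"$SPARK_HOME/bin/spark-submit" \
  --master k8s://https://<api-server> \
  --deploy-mode cluster \
  --conf "spark.kubernetes.driverEnv.DEFAULT_DRIVER_JVM_OPTIONS= " \
  --conf spark.kubernetes.container.image=<spark-image> \
  --class org.apache.spark.examples.SparkPi \
  local:///path/to/spark-examples.jar
# The single-space value mirrors the `export DEFAULT_DRIVER_JVM_OPTIONS=' '` suggested above;
# exact quoting of an effectively-empty value may need care in practice.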
@skonto. What do you think about the above? For me, the above looks more direct.
@dongjoon-hyun Ok, so that is what I also concluded :) I can do that, it looks fine to me; the only issue is that this way we open the door to adding arbitrary options there. Initially that was what I didn't want to support, thus I used the flag to make the defaults unchangeable (defaults usually have fixed values).
But we could be flexible here for simplicity's sake, so I will change it to what you suggest.
Kubernetes integration test starting
Test build #108420 has finished for PR 25229 at commit
      cp /opt/spark/conf/spark.properties /tmp/spark.properties
      echo 'spark.driver.extraJavaOptions=-XX:OnOutOfMemoryError="kill -9 %p"' >> /tmp/spark.properties
    fi
echo "$@" | sed 's|/opt/spark/conf/spark.properties |/tmp/spark.properties |g' |
We cannot avoid sed if we want to make sure the default is always added. Otherwise, if we just add spark.driver.extraJavaOptions to CMD, a user-provided spark.driver.extraJavaOptions will override it, as Spark conf has precedence rules when the same property is set more than once (the last value is the one chosen).
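A runnable illustration of what the sed rewrite produces (the user option matches the integration test; the temporary file name is just for the demo):

echo 'spark.driver.extraJavaOptions=-Dspark.test.foo=spark.test.bar' > /tmp/demo.properties
sed 's/spark.driver.extraJavaOptions=/&-XX:OnOutOfMemoryError="kill -9 %p" /g' /tmp/demo.properties
# prints: spark.driver.extraJavaOptions=-XX:OnOutOfMemoryError="kill -9 %p" -Dspark.test.foo=spark.test.bar
# i.e. the default is prepended to the same property, so the user's options still apply.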
If you follow https://github.com/apache/spark/pull/25229/files#r310866032, the suggested behavior is the following. So, it seems that we don't need this complexity.
> At (1), if spark.driver.extraJavaOptions has -XX:OnOutOfMemoryError=, it will supersede DEFAULT_DRIVER_JVM_OPTIONS because the last one is used.
Ok, so we say that spark.driver.extraJavaOptions overrides whatever the default is.
Kubernetes integration test status success
Retest this please.
Test build #108689 has finished for PR 25229 at commit
Kubernetes integration test starting
"(spark.driver.extraJavaOptions,-XX:OnOutOfMemoryError=\"kill -9 %p\" " + | ||
"-Dspark.test.foo=spark.test.bar)") | ||
runSparkPiAndVerifyCompletion(expectedLogOnCompletion = output) | ||
} |
Thank you for adding these test cases.
Kubernetes integration test status success
@dongjoon-hyun a clarification here please:
Should user supplied
I have updated the PR locally but am waiting for this clarification.
Hmm. I got it. That brings back my original question again. If the users supply
For me, the documentation update seems to be enough. What do you think about that? Did I miss something?
@dongjoon-hyun that was one of my options. So, this PR was intended to protect the user from Spark issues when things don't go well and the driver does not exit on K8s. Anyway, some people asked for this as well in the related discussion: #24796. I don't mind adding a note to the docs; what is the best UX?
@skonto. My name is
I knew the history of #24796 and Spark uses
In general, I believe the best UX is to keep it simple by reusing the existing general one.
Hi, @srowen, @squito, @mccheah, @zsxwing, @tgravescs. This is the continuation of #24796. Could you review this PR's implementation in order to make progress and finalize the issue?
Sorry, I am not sure I entirely follow the full discussion here. I think that I tend to agree with dongjoon, though -- can't users just set this themselves?
I feel a bit torn here, as I do think we should try to have sensible defaults, and this does seem better. The driver JVM is always tricky, because Spark is not entirely responsible for it -- e.g. the user might be running a server which is doing all sorts of things, and it happens to use a Spark context for servicing some requests. Not an architecture I'd recommend, and I dunno what else it could do with an OOM anyway -- but still I hesitate to have Spark be changing that. Then again, maybe Spark 3.0 is a good chance for us to change this to a more sane default?
Anyway, I guess I'm not saying anything definitive yet, just thinking "out loud" ... another thing which may be relevant here is #24804 / SPARK-23472 -- it at least makes it easy for an admin to set this option across the board.
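For reference, a sketch of that admin-level route via SPARK-23472 / #24804; the defaultJavaOptions property names come from that PR as I understand it, and the file handling here is purely illustrative:

# Set the OOM behavior cluster-wide in spark-defaults.conf so individual jobs don't have to.
cat >> "$SPARK_HOME/conf/spark-defaults.conf" <<'EOF'
spark.driver.defaultJavaOptions    -XX:OnOutOfMemoryError="kill -9 %p"
spark.executor.defaultJavaOptions  -XX:OnOutOfMemoryError="kill -9 %p"
EOF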
Ok, sorry, I refreshed my memory a bit more -- I think I lean towards including this by default. Yes, the user could do it themselves, but they are likely to forget, and the cases where they don't want it are very few. I agree with dongjoon, though, that this seems uglier than necessary -- probably easier if we add an option to change it for all cluster managers, and then handle it in spark/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java (lines 268 to 273 in a59fdc4).
Thank you for your advice, @squito. +1 for adding this for all cluster managers.
@dongjoon-hyun sorry for the name spelling thing :) I am on vacation but will slowly have a look at what @squito suggests ;)
Thank you, @skonto!
Although there was a request, this is a kind of nice-to-have. I mean, this is not a blocker because this is not a regression from 2.4.3. We don't need to hurry. Since Apache Spark 2.4.4 RC1 is scheduled for Monday, we can revisit this later.
I'm +1 on turning this on by default for Spark 3; I think killing the JVM on OOM is the most reasonable option. I think having this be a global config across cluster managers, not just for K8s, as suggested by @squito, is a good idea. If there were a way to also write to the default termination log, I think that would be a useful possible follow-on PR.
Kubernetes integration test starting
Kubernetes integration test status failure
Thanks @holdenk @dongjoon-hyun, I will target Spark 3 and try to implement this for all managers.
Sounds good :) Please ping me when you open that PR (or if you update this PR to support all of the cluster managers).
Back to this.
Cool, excited to see the progress.
Closing this in favor of #26161.
What changes were proposed in this pull request?
Note: This follows the discussion here: [SPARK-27900][CORE] Add uncaught exception handler to the driver #24796. Without this, pods on K8s will keep running although Spark has failed. In addition, the current behavior creates a problem for the Spark Operator and any other operator, as they cannot detect the failure at the K8s level.
How was this patch tested?
Manually, by launching SparkPi with a large argument (100000000), which leads to an OOM due to the large number of tasks allocated.
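For completeness, a sketch of such a manual run (master URL, image, and jar path are placeholders):

"$SPARK_HOME/bin/spark-submit" \
  --master k8s://https://<api-server> \
  --deploy-mode cluster \
  --name spark-pi-oom \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///path/to/spark-examples.jar 100000000
# The driver should OOM while handling the huge number of tasks; with the flag in place
# the JVM is killed and the driver pod reaches a failed state instead of hanging.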