
[SPARK-28294][CORE] Support spark.history.fs.cleaner.maxNum configuration #25072

Closed
wants to merge 6 commits into from

Conversation

dongjoon-hyun
Member

What changes were proposed in this pull request?

Up to now, Apache Spark maintains the given event log directory with a time-based policy, spark.history.fs.cleaner.maxAge. However, there are two issues.

  1. Some file systems limit the maximum number of files in a single directory. For example, HDFS dfs.namenode.fs-limits.max-directory-items is 1024 * 1024 by default.
    https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
  2. Spark is sometimes unable to clean up some old log files due to permission issues (mainly, security policy).

To handle both (1) and (2), this PR aims to support an additional policy configuration for the maximum number of files in the event log directory, spark.history.fs.cleaner.maxNum. Spark will try to keep the number of files in the event log directory according to this policy.
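Assuming the configuration lands as described, enabling both retention policies together in spark-defaults.conf might look like the following; the values shown are purely illustrative:

```properties
# Illustrative spark-defaults.conf fragment; values are examples only.
spark.history.fs.cleaner.enabled   true
# Existing time-based policy: delete event logs older than 7 days.
spark.history.fs.cleaner.maxAge    7d
# Count-based policy proposed in this PR: keep at most 100000 files.
spark.history.fs.cleaner.maxNum    100000
```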

How was this patch tested?

Passes the Jenkins tests with a newly added test case.

@SparkQA

SparkQA commented Jul 8, 2019

Test build #107329 has finished for PR 25072 at commit 59277d6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Thank you for the review, @srowen ! I'll update soon.

@dongjoon-hyun
Member Author

Could you review once more? I addressed some of the comments. For the following points, please let me know your opinion if I missed anything. I can update more.

Member

@srowen left a comment


Seems reasonable to me. I usually dislike yet another config, but I can see the need for this. This default should virtually never affect any cases that work today, right? If someone had more than a million files, it's probably already causing problems. I just want to make sure this doesn't surprise anyone as a behavior change, if possible.

@dongjoon-hyun
Member Author

dongjoon-hyun commented Jul 8, 2019

For the default value, I agree with your concerns.

This default value, 1M, is big enough not to surprise general HDFS-default users, but there might be two exceptions.

  1. If they increase dfs.namenode.fs-limits.max-directory-items configuration already, the maximum is 6 * 1024 * 1024.
  2. If they are using non-HDFS storage, maybe S3?

So, not to surprise all users, including S3 users, do you want me to use Int.MaxValue instead? I can change it like that. Technically, that will disable this feature, but spark.history.fs.cleaner.enabled itself is disabled by default, too.
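For intuition, the count-based policy discussed here can be sketched roughly as follows, keeping the newest maxNum logs and returning the rest for deletion. The names LogEntry, CleanerSketch, and overflowing are illustrative, not the actual Spark History Server code:

```scala
// Illustrative sketch of a count-based retention policy; NOT the actual
// FsHistoryProvider implementation. Given log entries and a maxNum limit,
// return the oldest entries beyond the limit, i.e. candidates for deletion.
object CleanerSketch {
  case class LogEntry(path: String, lastModified: Long)

  def overflowing(entries: Seq[LogEntry], maxNum: Int): Seq[LogEntry] = {
    if (entries.size <= maxNum) {
      Seq.empty // within the limit: nothing to delete
    } else {
      // Sort oldest first; everything beyond the newest maxNum is returned.
      entries.sortBy(_.lastModified).take(entries.size - maxNum)
    }
  }
}
```

With a default of Int.MaxValue, overflowing would effectively never return anything, which matches the "disabled by default" behavior discussed above.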

@dongjoon-hyun
Member Author

The default value is increased to Int.MaxValue, too.


@SparkQA

SparkQA commented Jul 8, 2019

Test build #107369 has finished for PR 25072 at commit f2c93a0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 8, 2019

Test build #107368 has finished for PR 25072 at commit 3444b73.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Hi, @srowen .
Could you review this once more, please?

Member

@srowen left a comment


LGTM. I wonder if @vanzin has any thoughts as he wrote the maxAge cleanup bit.

Member

@srowen left a comment


(Meant to approve)

@dongjoon-hyun
Member Author

Thank you so much for the review and approval, @srowen !
Yes, I'll wait for @vanzin 's comment for one day.

Hi, @vanzin . Could you take a look at this if you have some time?

@dongjoon-hyun
Member Author

Retest this please.

@SparkQA

SparkQA commented Jul 10, 2019

Test build #107417 has finished for PR 25072 at commit f2c93a0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Merged to master. Thank you, @srowen .
