[SPARK-48392][CORE] Also load spark-defaults.conf when provided --properties-file #46709

Status: Closed · wants to merge 3 commits
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala

@@ -125,14 +125,28 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
    * When this is called, `sparkProperties` is already filled with configs from the latter.
    */
   private def mergeDefaultSparkProperties(): Unit = {
-    // Use common defaults file, if not specified by user
-    propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultPropertiesFile(env))
-    // Honor --conf before the defaults file
+    // Honor --conf before the specified properties file and defaults file
     defaultSparkProperties.foreach { case (k, v) =>
Contributor:

When running with --verbose, the `defaultSparkProperties` evaluation will log "Using properties file: null" if --properties-file is not provided, since `propertiesFile` is not set yet. I think this is not expected.

Member Author:

Oops, that's right. I'll try to address it in the follow-up PR.
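For context, `defaultSparkProperties` is evaluated lazily and logs which properties file it reads. A minimal, simplified sketch of the ordering issue discussed in this thread (hypothetical names, not the actual SparkSubmitArguments code):

```scala
// Simplified sketch (assumed names): the lazy defaults are forced before
// propertiesFile is resolved, so the log statement observes a null path.
object OrderingSketch {
  var propertiesFile: String = null // not yet resolved to a default

  // Mirrors the idea of defaultSparkProperties: a lazy view over the
  // properties file that logs which file it is reading.
  lazy val defaultProps: Map[String, String] = {
    println(s"Using properties file: $propertiesFile") // prints "null" here
    Option(propertiesFile).map(loadProps).getOrElse(Map.empty)
  }

  private def loadProps(path: String): Map[String, String] = Map.empty // stub

  def main(args: Array[String]): Unit = {
    defaultProps // forced first: logs "Using properties file: null"
    propertiesFile = "/etc/spark/spark-defaults.conf" // resolved only afterwards
  }
}
```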

       if (!sparkProperties.contains(k)) {
         sparkProperties(k) = v
       }
     }

+    // Also load properties from `spark-defaults.conf` if they do not exist in the properties file
+    // and --conf list
+    val defaultSparkConf = Utils.getDefaultPropertiesFile(env)
+    Option(defaultSparkConf).foreach { filename =>
+      val properties = Utils.getPropertiesFromFile(filename)
+      properties.foreach { case (k, v) =>
+        if (!sparkProperties.contains(k)) {
+          sparkProperties(k) = v
+        }
+      }
+    }
+
+    if (propertiesFile == null) {
+      propertiesFile = defaultSparkConf
Member Author:

This is mostly to make the test happy.

Member:

I think this follows the original behavior of overwriting the `propertiesFile` value, although it does look like strange behavior.

+    }
   }

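The net effect of the change above is a three-level precedence: explicit `--conf` flags win, then entries from the file given via `--properties-file`, then entries from `spark-defaults.conf`. A minimal standalone sketch of that merge rule (hypothetical object and helper names; not the actual Spark code):

```scala
// Hypothetical illustration of the precedence introduced by this PR:
// --conf  >  --properties-file  >  spark-defaults.conf
object PrecedenceSketch {
  def merge(
      confFlags: Map[String, String],      // from --conf
      propertiesFile: Map[String, String], // from --properties-file
      sparkDefaults: Map[String, String]   // from spark-defaults.conf
  ): Map[String, String] = {
    // In Map ++, entries on the right override those on the left,
    // so confFlags (highest precedence) is applied last.
    sparkDefaults ++ propertiesFile ++ confFlags
  }

  def main(args: Array[String]): Unit = {
    val merged = merge(
      confFlags = Map("spark.master" -> "yarn"),
      propertiesFile = Map("spark.executor.cores" -> "16"),
      sparkDefaults = Map("spark.executor.memory" -> "3g", "spark.executor.cores" -> "8"))
    // spark.executor.cores comes from the properties file, not spark-defaults.conf
    assert(merged("spark.executor.cores") == "16")
    assert(merged("spark.executor.memory") == "3g")
    println(merged)
  }
}
```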
80 changes: 52 additions & 28 deletions core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -1113,6 +1113,23 @@ class SparkSubmitSuite
     }
   }

test("SPARK-48392: Allow both spark-defaults.conf and properties file") {
forConfDir(Map("spark.executor.memory" -> "3g")) { path =>
withPropertyFile("spark-conf.properties", Map("spark.executor.cores" -> "16")) { propsFile =>
val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
val args = Seq(
"--class", SimpleApplicationTest.getClass.getName.stripSuffix("$"),
"--name", "testApp",
"--master", "local",
"--properties-file", propsFile,
unusedJar.toString)
val appArgs = new SparkSubmitArguments(args, env = Map("SPARK_CONF_DIR" -> path))
appArgs.executorMemory should be("3g")
appArgs.executorCores should be("16")
}
}
}

test("support glob path") {
withTempDir { tmpJarDir =>
withTempDir { tmpFileDir =>
Expand Down Expand Up @@ -1623,6 +1640,22 @@ class SparkSubmitSuite
}
}

+  private def withPropertyFile(fileName: String, conf: Map[String, String])(f: String => Unit) = {
Member Author:

Extract some common logic for creating a property file into a util function.

+    withTempDir { tmpDir =>
+      val props = new java.util.Properties()
+      val propsFile = File.createTempFile(fileName, "", tmpDir)
+      val propsOutputStream = new FileOutputStream(propsFile)
+      try {
+        conf.foreach { case (k, v) => props.put(k, v) }
+        props.store(propsOutputStream, "")
+      } finally {
+        propsOutputStream.close()
+      }
+
+      f(propsFile.getPath)
+    }
+  }
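A quick usage sketch of the helper defined above (hypothetical call site; the body receives the path of a temporary file in java.util.Properties format holding the given entries):

```scala
// `withPropertyFile` comes from the diff above; key and value are illustrative.
withPropertyFile("demo.properties", Map("spark.app.name" -> "demo")) { path =>
  // `path` is suitable as the argument to spark-submit's --properties-file flag.
  assert(new java.io.File(path).exists())
}
```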

   private def updateConfWithFakeS3Fs(conf: Configuration): Unit = {
     conf.set("fs.s3a.impl", classOf[TestFileSystem].getCanonicalName)
     conf.set("fs.s3a.impl.disable.cache", "true")

@@ -1694,40 +1727,31 @@ class SparkSubmitSuite
     val infixDelimFromFile = s"${delimKey}infixDelimFromFile" -> s"${CR}blah${LF}"
     val nonDelimSpaceFromFile = s"${delimKey}nonDelimSpaceFromFile" -> " blah\f"

-    val testProps = Seq(leadingDelimKeyFromFile, trailingDelimKeyFromFile, infixDelimFromFile,
+    val testProps = Map(leadingDelimKeyFromFile, trailingDelimKeyFromFile, infixDelimFromFile,
       nonDelimSpaceFromFile)

-    val props = new java.util.Properties()
-    val propsFile = File.createTempFile("test-spark-conf", ".properties",
-      Utils.createTempDir())
-    val propsOutputStream = new FileOutputStream(propsFile)
-    try {
-      testProps.foreach { case (k, v) => props.put(k, v) }
-      props.store(propsOutputStream, "test whitespace")
-    } finally {
-      propsOutputStream.close()
-    }
-
-    val clArgs = Seq(
-      "--class", "org.SomeClass",
-      "--conf", s"${lineFeedFromCommandLine._1}=${lineFeedFromCommandLine._2}",
-      "--conf", "spark.master=yarn",
-      "--properties-file", propsFile.getPath,
-      "thejar.jar")
-
-    val appArgs = new SparkSubmitArguments(clArgs)
-    val (_, _, conf, _) = submit.prepareSubmitEnvironment(appArgs)
-
-    Seq(
-      lineFeedFromCommandLine,
-      leadingDelimKeyFromFile,
-      trailingDelimKeyFromFile,
-      infixDelimFromFile
-    ).foreach { case (k, v) =>
-      conf.get(k) should be (v)
-    }
-
-    conf.get(nonDelimSpaceFromFile._1) should be ("blah")
+    withPropertyFile("test-spark-conf.properties", testProps) { propsFile =>
+      val clArgs = Seq(
+        "--class", "org.SomeClass",
+        "--conf", s"${lineFeedFromCommandLine._1}=${lineFeedFromCommandLine._2}",
+        "--conf", "spark.master=yarn",
+        "--properties-file", propsFile,
+        "thejar.jar")
+
+      val appArgs = new SparkSubmitArguments(clArgs)
+      val (_, _, conf, _) = submit.prepareSubmitEnvironment(appArgs)
+
+      Seq(
+        lineFeedFromCommandLine,
+        leadingDelimKeyFromFile,
+        trailingDelimKeyFromFile,
+        infixDelimFromFile
+      ).foreach { case (k, v) =>
+        conf.get(k) should be (v)
+      }
+
+      conf.get(nonDelimSpaceFromFile._1) should be ("blah")
+    }
   }

test("get a Spark configuration from arguments") {
11 changes: 7 additions & 4 deletions docs/configuration.md

@@ -111,12 +111,15 @@ each line consists of a key and a value separated by whitespace. For example:
spark.eventLog.enabled true
spark.serializer org.apache.spark.serializer.KryoSerializer

+In addition, a property file with Spark configurations can be passed to `bin/spark-submit` via
+the `--properties-file` parameter.
+
 Any values specified as flags or in the properties file will be passed on to the application
 and merged with those specified through SparkConf. Properties set directly on the SparkConf
-take highest precedence, then flags passed to `spark-submit` or `spark-shell`, then options
-in the `spark-defaults.conf` file. A few configuration keys have been renamed since earlier
-versions of Spark; in such cases, the older key names are still accepted, but take lower
-precedence than any instance of the newer key.
+take the highest precedence, then those through `--conf` flags or `--properties-file` passed to
+`spark-submit` or `spark-shell`, then options in the `spark-defaults.conf` file. A few
+configuration keys have been renamed since earlier versions of Spark; in such cases, the older
+key names are still accepted, but take lower precedence than any instance of the newer key.
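To make the documented precedence chain concrete, a small sketch (hypothetical values; assumes Spark on the classpath):

```scala
import org.apache.spark.SparkConf

object DocsPrecedenceExample {
  def main(args: Array[String]): Unit = {
    // Suppose spark-defaults.conf contains:  spark.executor.memory 1g
    // and spark-submit was invoked with:     --conf spark.executor.memory=2g
    // Properties set directly on the SparkConf take the highest precedence:
    val conf = new SparkConf().set("spark.executor.memory", "4g")
    // The application sees 4g: SparkConf > --conf / --properties-file > spark-defaults.conf
    println(conf.get("spark.executor.memory")) // 4g
  }
}
```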

Spark properties mainly can be divided into two kinds: one is related to deploy, like
"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when