[SPARK-4990][Deploy]to find default properties file, search SPARK_CONF_DIR first #3823


Closed
wants to merge 7 commits

Conversation

WangTaoTheTonic
Contributor

@WangTaoTheTonic WangTaoTheTonic changed the title [SPARK-4990]to find default properties file, search SPARK_CONF_DIR first [SPARK-4990][Deploy]to find default properties file, search SPARK_CONF_DIR first Dec 29, 2014
@SparkQA

SparkQA commented Dec 29, 2014

Test build #24857 has started for PR 3823 at commit c5a85eb.

  • This patch merges cleanly.

@@ -42,7 +42,10 @@ while (($#)); do
shift
done

-DEFAULT_PROPERTIES_FILE="$SPARK_HOME/conf/spark-defaults.conf"
+DEFAULT_PROPERTIES_FILE="$SPARK_CONF_DIR/spark-defaults.conf"
Member

SparkSubmitArguments already ultimately handles this case, right? What does this fix?

Contributor Author

Used for:

if [[ "$SPARK_SUBMIT_DEPLOY_MODE" == "client" && -f "$SPARK_SUBMIT_PROPERTIES_FILE" ]]; then
  # Parse the properties file only if the special configs exist
  contains_special_configs=$(
    grep -e "spark.driver.extra*\|spark.driver.memory" "$SPARK_SUBMIT_PROPERTIES_FILE" |
    grep -v "^[[:space:]]*#"
  )
  if [ -n "$contains_special_configs" ]; then
    export SPARK_SUBMIT_BOOTSTRAP_DRIVER=1
  fi
fi
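As a hypothetical illustration of how that two-stage filter behaves (the file contents and variable names below are invented for the demo, not part of the patch), an uncommented spark.driver.memory line triggers the bootstrap flag while a commented-out one is ignored:

```shell
# Hypothetical demo of the grep filter above on a throwaway properties file.
tmpconf=$(mktemp)
cat > "$tmpconf" <<'EOF'
# spark.driver.memory 2g
spark.driver.memory 4g
spark.executor.memory 8g
EOF

# Same two-stage filter as in the snippet: match the special driver configs,
# then drop lines that are commented out.
contains_special_configs=$(
  grep -e "spark.driver.extra*\|spark.driver.memory" "$tmpconf" |
  grep -v "^[[:space:]]*#"
)
if [ -n "$contains_special_configs" ]; then
  bootstrap=1
else
  bootstrap=0
fi
echo "bootstrap=$bootstrap"
rm -f "$tmpconf"
```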

@SparkQA

SparkQA commented Dec 29, 2014

Test build #24857 has finished for PR 3823 at commit c5a85eb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24857/

@OopsOutOfMemory
Contributor

Hi, @WangTaoTheTonic : )
This makes sense to me; Hadoop also has HADOOP_CONF_DIR.

But I'd prefer to first check whether the SPARK_CONF_DIR directory exists, rather than only checking the configuration file.
If many files under SPARK_CONF_DIR need to be added to spark-submit in the future, you would have to check the existence of each file separately.

You can do it like:

if [ ! -d "$SPARK_CONF_DIR" ]; then

@WangTaoTheTonic
Contributor Author

@OopsOutOfMemory Thanks for your comment; I understand your concern.
Actually it doesn't matter if SPARK_CONF_DIR does not exist here, because we can fall back to $SPARK_HOME/conf. The existence check on the properties file already covers the SPARK_CONF_DIR case.
Elsewhere in the Spark code we usually handle configuration with getOrElse logic.
That makes it easier to analyse when a user gets some specific config wrong, and we'd better not break this tradition. :)

@OopsOutOfMemory
Contributor

Sorry, maybe there is a misunderstanding here.
What I mean is to change the checking logic for the properties file so that it keys off SPARK_CONF_DIR, whether user-specified or default, rather than adding an extra directory check on top of each per-file check.
Let me raise an example:

if [ ! -d "$SPARK_CONF_DIR" ]; then
   export SPARK_CONF_DIR="$SPARK_HOME/conf"
fi
DEFAULT_PROPERTIES_FILE="$SPARK_CONF_DIR/spark-defaults.conf"
XXX_PROPERTIES_FILE="$SPARK_CONF_DIR/xxx.conf"

Checking the conf directory is more reasonable, because the key point is SPARK_CONF_DIR. The original concern here is to change the path of SPARK_CONF_DIR, not only spark-defaults.conf.

"analyse when a user gets some specific config wrong"

We could add an extra warning for the key configuration file here, i.e. if spark-defaults.conf is missing or moved to a user-specific directory under the conf dir, we could log a warning so the user is aware of it before submitting.

Currently, I think both solutions are OK! : )

@WangTaoTheTonic
Contributor Author

@OopsOutOfMemory Ok I got what you mean. After checking the logic in SparkSubmitArguments.scala I do think that your solution is more reasonable. Thanks.

val sparkHomeConfig = env.get("SPARK_HOME").map(sparkHome => s"${sparkHome}${sep}conf")
val confDir = env.get("SPARK_CONF_DIR").orElse(sparkHomeConfig)
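For comparison, the same orElse-style fallback can be sketched in shell with `${var:-default}` parameter expansion (a hypothetical sketch; the path is made up for the demo):

```shell
# Hypothetical sketch: shell equivalent of the Scala orElse fallback above.
SPARK_HOME="/opt/spark"   # assumed install location, demo only
unset SPARK_CONF_DIR      # simulate: the user did not set it

# Use SPARK_CONF_DIR when set and non-empty, else fall back to $SPARK_HOME/conf.
conf_dir="${SPARK_CONF_DIR:-$SPARK_HOME/conf}"
echo "$conf_dir"
```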

@SparkQA

SparkQA commented Dec 30, 2014

Test build #24875 has started for PR 3823 at commit 07b9ebf.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Dec 30, 2014

Test build #24877 has started for PR 3823 at commit d8d3cb7.

  • This patch merges cleanly.

@@ -42,7 +42,10 @@ while (($#)); do
shift
done

-DEFAULT_PROPERTIES_FILE="$SPARK_HOME/conf/spark-defaults.conf"
+if [ ! -d "$SPARK_CONF_DIR" ]; then
+SPARK_CONF_DIR="$SPARK_HOME/conf"
Contributor

I recommend adding the export keyword to make SPARK_CONF_DIR global :)

 export SPARK_CONF_DIR="$SPARK_HOME/conf"

Just like SPARK_HOME, it's a global variable.

@SparkQA

SparkQA commented Dec 30, 2014

Test build #24875 has finished for PR 3823 at commit 07b9ebf.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24875/

@SparkQA

SparkQA commented Dec 30, 2014

Test build #24877 has finished for PR 3823 at commit d8d3cb7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24877/

@SparkQA

SparkQA commented Dec 30, 2014

Test build #24883 has started for PR 3823 at commit 55300bc.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Dec 30, 2014

Test build #24883 has finished for PR 3823 at commit 55300bc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24883/

@andrewor14
Contributor

LGTM, but this seems to have a conflict from your other patch now. Could you rebase to master?

@WangTaoTheTonic
Contributor Author

@andrewor14 Ok, rebase done.

@SparkQA

SparkQA commented Jan 9, 2015

Test build #25286 has started for PR 3823 at commit 4cc7f34.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 9, 2015

Test build #25286 has finished for PR 3823 at commit 4cc7f34.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25286/

@@ -44,7 +44,10 @@ while (($#)); do
shift
done

-DEFAULT_PROPERTIES_FILE="$SPARK_HOME/conf/spark-defaults.conf"
+if [ ! -d "$SPARK_CONF_DIR" ]; then
Contributor

actually, maybe this should be "if not defined" instead of "if not exists". Otherwise, if the user sets SPARK_CONF_DIR to a directory that doesn't exist, this will silently use a different config.

Contributor

i.e.

if [ -z "$SPARK_CONF_DIR" ]; then
  export SPARK_CONF_DIR="$SPARK_HOME/conf"
fi
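To illustrate the difference between the two checks (a hypothetical sketch; the paths are invented and the mistyped one is assumed not to exist on the machine):

```shell
# Hypothetical demo: "-d" (not exists) vs "-z" (not defined) fallback.
SPARK_HOME="/opt/spark"            # assumed install location, demo only
SPARK_CONF_DIR="/typo/spark-conf"  # user-set, but the directory is missing

# Strategy 1: "if not exists" silently discards the user's mistyped value
# and reads a different config.
dir1="$SPARK_CONF_DIR"
if [ ! -d "$dir1" ]; then dir1="$SPARK_HOME/conf"; fi

# Strategy 2: "if not defined" keeps the user's value, so the mistake
# surfaces later instead of being papered over.
dir2="$SPARK_CONF_DIR"
if [ -z "$dir2" ]; then dir2="$SPARK_HOME/conf"; fi

echo "not-exists fallback: $dir1"
echo "not-defined fallback: $dir2"
```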

@SparkQA

SparkQA commented Jan 10, 2015

Test build #25347 has started for PR 3823 at commit b1ab402.

  • This patch merges cleanly.

@WangTaoTheTonic
Contributor Author

@andrewor14 Ok, that makes sense. I changed "not exist" to "not defined" in the Windows cmd script too.

@SparkQA

SparkQA commented Jan 10, 2015

Test build #25348 has started for PR 3823 at commit 133c43e.

  • This patch merges cleanly.

@andrewor14
Contributor

Ok, I'm merging this into master since the tests are irrelevant here. Thanks.

@asfgit asfgit closed this in 8782eb9 Jan 10, 2015
@SparkQA

SparkQA commented Jan 10, 2015

Test build #25347 has finished for PR 3823 at commit b1ab402.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25347/

@SparkQA

SparkQA commented Jan 10, 2015

Test build #25348 has finished for PR 3823 at commit 133c43e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25348/
