-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-8687][YARN]Fix bug: Executor can't fetch the new set configuration in yarn-client #7066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Merged build triggered. |
|
Merged build started. |
|
Test build #35931 has started for PR 7066 at commit |
|
Test build #35931 has finished for PR 7066 at commit
|
|
Merged build finished. Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only used in 1 place. I would just inline it in L176.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after you move it, I would add a comment that says:
// Retrieve fresh properties from SparkConf again in case a new property is set
// This is necessary for passing certain properties to executors in YARN (SPARK-8687)
|
@SaintBacchus I looked at this quickly and I think this approach is reasonable. This seems to only affect YARN client mode where the My only concern with this approach is that different executors may get different configurations. It does potentially introduce an unlikely race condition, where executors get registered before we set the property. |
|
Actually, have you considered the following alternative? We modify This guarantees that by the time we call |
|
Merged build triggered. |
|
Merged build started. |
|
Test build #36078 has started for PR 7066 at commit |
@andrewor14 This modify is much more suitable for this problem. But if user had to set configuration in other deploy mode, they had to be cautious about this problem. |
|
In general I think we discourage the developers to set additional variables in the I think the reordering is the simpler approach here, otherwise we'll have to worry about executors potentially getting different properties, which is much more confusing. |
|
Test build #36078 has finished for PR 7066 at commit
|
|
Merged build finished. Test PASSed. |
|
Merging into master. Thanks @SaintBacchus! |
…ration in yarn-client
Spark initi the properties CoarseGrainedSchedulerBackend.start
```scala
// TODO (prashant) send conf instead of properties
driverEndpoint = rpcEnv.setupEndpoint(
CoarseGrainedSchedulerBackend.ENDPOINT_NAME, new DriverEndpoint(rpcEnv, properties))
```
Then the yarn logic will set some configuration but not update in this `properties`.
So `Executor` won't gain the `properties`.
[Jira](https://issues.apache.org/jira/browse/SPARK-8687)
Author: huangzhaowei <carlmartinmax@gmail.com>
Closes #7066 from SaintBacchus/SPARK-8687 and squashes the following commits:
1de4f48 [huangzhaowei] Ensure all necessary properties have already been set before startup ExecutorLaucher
(cherry picked from commit 1b0c8e6)
Signed-off-by: Andrew Or <andrew@databricks.com>
Spark initi the properties CoarseGrainedSchedulerBackend.start
Then the yarn logic will set some configuration but not update in this
properties.So
Executorwon't gain theproperties.Jira