[SPARK-5979][SPARK-6031][SPARK-6032][SPARK-6047] Refactoring for --packages -> Move to SparkSubmitDriverBootstrapper #4754
Test build #27927 has finished for PR 4754 at commit
|
@pwendell Please take a look at this. I think you reviewed the original PR of this feature. |
LGTM |
@tdas I added a hack to include the jars on --driver-extra-classpath. Can you try your test now? |
Test build #27942 has finished for PR 4754 at commit
|
Test build #27943 has finished for PR 4754 at commit
|
I tested. Still not working. I enabled verbose logging on spark-submit and saw this
So I can see that the relevant jars are being added to the classpath elements, but pyspark is still unable to find org.apache.spark.streaming.kafka.KafkaUtils (from /Users/tdas/.ivy2/jars/spark-streaming-kafka_2.10.jar). Let's debug this tomorrow morning. |
No, I verified the class does exist in the jar
|
Test build #27963 has finished for PR 4754 at commit
|
@tdas @pwendell @andrewor14 |
Test build #27973 has finished for PR 4754 at commit
|
Jenkins, test this again. |
It might not be a flaky test. I might have broken some Yarn feature. I'm
|
Ohh... okay. |
Test build #622 has finished for PR 4754 at commit
|
@tdas The latest commit fixed the issue, feel free to test |
Test build #27989 has finished for PR 4754 at commit
|
Test build #27990 has finished for PR 4754 at commit
|
Test build #28007 has finished for PR 4754 at commit
|
This passed locally. What the...
|
retest this please |
Test build #28011 has finished for PR 4754 at commit
|
This reverts commit b7a9e93.
Test build #28015 has finished for PR 4754 at commit
|
Test build #28019 has finished for PR 4754 at commit
|
Flaky test this time... @tdas, can you have this retested please? |
Jenkins, retest this please |
Test build #28023 has finished for PR 4754 at commit
|
@srowen Thank you! |
@brkyvz I think you need to address a couple of more JIRAs in this PR. 4 aint enough ;) |
newClasspath += sys.props("path.separator") +
  resolvedMavenCoordinates.mkString(sys.props("path.separator"))
submitArgs =
  Array("--packages-resolved", resolvedMavenCoordinates.mkString(",")) ++ submitArgs
Can we thread this through using an environment variable `_PACKAGES_RESOLVED`? Having this as an extra flag forces you to make `args` here mutable, which is sort of strange.
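A minimal sketch of what that suggestion could look like, assuming the parent process builds the child's environment as a `Map` and the child reads the jar list back from it. Only the variable name `_PACKAGES_RESOLVED` comes from the comment above; the object and method names are hypothetical and are not taken from the Spark code base.

```scala
// Hypothetical sketch: hand the resolved jars to the child process through an
// environment variable, so the argument array never has to be reassigned.
object PackagesEnvSketch {
  // Variable name from the review comment; everything else here is assumed.
  val EnvKey = "_PACKAGES_RESOLVED"

  /** Parent side: return a new environment map that carries the jar list. */
  def childEnv(resolvedJars: Seq[String], baseEnv: Map[String, String]): Map[String, String] =
    if (resolvedJars.isEmpty) baseEnv
    else baseEnv + (EnvKey -> resolvedJars.mkString(","))

  /** Child side: recover the jar list; empty when the variable is unset. */
  def resolvedFromEnv(env: Map[String, String]): Seq[String] =
    env.get(EnvKey).filter(_.nonEmpty).map(_.split(",").toSeq).getOrElse(Seq.empty)

  /** Append the jars to a classpath using the platform path separator. */
  def extendClasspath(classpath: String, jars: Seq[String]): String = {
    val sep = sys.props("path.separator")
    (classpath +: jars).filter(_.nonEmpty).mkString(sep)
  }
}
```

With this shape the submit arguments stay a `val`: the parent calls `childEnv` once when launching the driver, and the child calls `resolvedFromEnv(sys.env)` instead of parsing an extra flag.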
pwendell tdas

This is the safer parts of PR #4754:
- SPARK-5979: All dependencies with the groupId `org.apache.spark` passed through `--packages`, were being excluded from the dependency tree on the assumption that they would be in the assembly jar. This is not the case, therefore the exclusion rules had to be defined more explicitly.
- SPARK-6032: Ivy prints a whole lot of logs while retrieving dependencies. These were printed to `System.out`. Moved the logging to `System.err`.

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #4802 from brkyvz/simple-streaming-fix and squashes the following commits:

e0f38cb [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into simple-streaming-fix
bad921c [Burak Yavuz] [SPARK-5979][SPARK-6032] Smaller safer fix

(cherry picked from commit 6d8e5fb)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
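To illustrate the SPARK-5979 point, here is a rough sketch of the difference between excluding a whole groupId and excluding only named artifacts. It is plain Scala and does not use the Ivy API; the prefix list, the `Coordinate` type, and the version `1.3.0` are assumptions made for the example, not the rules the patch actually defines.

```scala
// Hypothetical sketch: exclude only artifacts assumed to be in the assembly jar,
// instead of every dependency whose groupId is org.apache.spark.
object ExclusionSketch {
  // Illustrative prefixes only; the real list of assembly artifacts is not shown here.
  val assemblyPrefixes: Seq[String] = Seq("spark-core_", "spark-sql_", "spark-streaming_")

  case class Coordinate(groupId: String, artifactId: String, version: String)

  /** Parse a "groupId:artifactId:version" string. */
  def parse(s: String): Coordinate = s.split(":") match {
    case Array(g, a, v) => Coordinate(g, a, v)
    case _ =>
      throw new IllegalArgumentException(s"Expected groupId:artifactId:version, got: $s")
  }

  /** True only for org.apache.spark artifacts whose name matches a known prefix. */
  def isExcluded(c: Coordinate): Boolean =
    c.groupId == "org.apache.spark" &&
      assemblyPrefixes.exists(p => c.artifactId.startsWith(p))
}
```

Under these assumed prefixes, `org.apache.spark:spark-streaming-kafka_2.10:1.3.0` is kept because its artifact name does not start with `spark-streaming_`, while `org.apache.spark:spark-core_2.10:1.3.0` is still excluded.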
@brkyvz let's close this issue for now and keep it in our back pocket. We can use it if we decide to put this in the 1.3 branch down the line. |
This PR is an umbrella PR for 3 JIRAs. Here are the explanations:
- SPARK-5979: All dependencies with the groupId `org.apache.spark` passed through `--packages` were being excluded from the dependency tree on the assumption that they would be in the assembly jar. This is not the case, therefore the exclusion rules had to be defined more explicitly.
- [...] `--packages` to SparkSubmitDriverBootstrapper solves this. However, this issue still remains for `--jars`.
- SPARK-6032: Ivy prints a whole lot of logs while retrieving dependencies. These were printed to `System.out`. Moved the logging to `System.err`.

@tdas Would you care to try this? I think it should solve your problem.
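For the Ivy logging item, one way to keep a chatty Java library off `System.out` is to swap the stream for the duration of the call and restore it afterwards. The helper below is a sketch under that assumption; the names `QuietResolveSketch` and `withStdoutRedirected` are hypothetical, and this is not presented as the code in the patch.

```scala
import java.io.PrintStream

// Hypothetical sketch: temporarily point System.out at another stream.
object QuietResolveSketch {
  /** Run `body` with System.out set to `target`, restoring the original even on failure. */
  def withStdoutRedirected[T](target: PrintStream)(body: => T): T = {
    val original = System.out
    System.setOut(target)
    try body
    finally System.setOut(original)
  }
}
```

A call such as `QuietResolveSketch.withStdoutRedirected(System.err) { resolve() }` (where `resolve` stands in for the dependency resolution step) would send anything the library writes to `System.out` to `System.err` instead. Note that this only affects code that writes to `System.out` directly; Scala's own `println` goes through `Console.out` and may not be redirected by it.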