-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-8679] [PySpark] [MLlib] Default values in Pipeline API should be immutable #7058
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Merged build triggered. |
|
Merged build started. |
|
Test build #35896 has started for PR 7058 at commit |
|
Test build #35896 has finished for PR 7058 at commit
|
|
Merged build finished. Test FAILed. |
|
jenkins test this please |
|
Merged build triggered. |
|
Merged build started. |
|
Test build #35898 has started for PR 7058 at commit |
|
Test build #35898 has finished for PR 7058 at commit
|
|
Merged build finished. Test FAILed. |
|
LGTM. I used In the future, it might be nice to enable a lint check for this as part of the automated build checks, but lets do that separately (I've filed https://issues.apache.org/jira/browse/SPARK-8706 for this). @mengxr, I'll let you merge this. |
|
Merged build triggered. |
|
Merged build started. |
|
I pushed one more instance that I came across today. Btw, what's with the comment showing test failed but the build showing test passed. |
|
Test build #36003 has started for PR 7058 at commit |
|
The spurious failure message from AmplabJenkins is a side-effect of some changes which skip the JVM tests for PRs that only touch certain PySpark modules. Ignore the status messages from AmplabJenkins and pay attention to SparkQA instead. I'm trying to fix this via the Jenkins configuration. |
|
Thanks for the clarification! |
|
Test build #36003 has finished for PR 7058 at commit
|
|
Merged build finished. Test FAILed. |
|
jenkins test this please |
|
The PySpark ML tests passed, so this looks fine to me. |
|
Jenkins, retest this please. |
|
Merged build triggered. |
|
Merged build started. |
|
I'm testing out a slightly silly hack to see if I can prevent the spurious AmplabJenkins errors: at the bottom of our Jenkins "execute shell" command, I'm adding an extra call to |
|
Test build #36005 has started for PR 7058 at commit |
|
Test build #36005 has finished for PR 7058 at commit
|
|
Merged build finished. Test PASSed. |
|
Woohoo, looks like my hack worked :) |
|
A slightly related concern is mutation of non-default arguments. There might be cases where we need to defensively copy the input arguments, but I'm not familiar enough with MLlib to know where they might occur. |
|
Hmm. Scikit-learn has a |
|
Jenkins, retest this please. |
|
(Re-testing to test the new PySpark test parallelization) |
|
Merged build triggered. |
|
Merged build started. |
|
Test build #36097 has started for PR 7058 at commit |
|
Test build #36097 has finished for PR 7058 at commit
|
|
Merged build finished. Test PASSed. |
…be immutable It might be dangerous to have a mutable as value for default param. (http://stackoverflow.com/a/11416002/1170730) e.g def func(example, f={}): f[example] = 1 return f func(2) {2: 1} func(3) {2:1, 3:1} mengxr Author: MechCoder <manojkumarsivaraj334@gmail.com> Closes #7058 from MechCoder/pipeline_api_playground and squashes the following commits: 40a5eb2 [MechCoder] copy 95f7ff2 [MechCoder] [SPARK-8679] [PySpark] [MLlib] Default values in Pipeline API should be immutable (cherry picked from commit 5fa0863) Signed-off-by: Xiangrui Meng <meng@databricks.com>
|
Merged into master and branch-1.4. Thanks! |
|
@MechCoder The JIRA should be 8678 instead of 8679. Please check the JIRA number next time:) |
|
alright, thanks! :) |
It might be dangerous to have a mutable as value for default param. (http://stackoverflow.com/a/11416002/1170730)
e.g
@mengxr