Skip to content

Conversation

@MechCoder
Copy link
Contributor

It might be dangerous to have a mutable as value for default param. (http://stackoverflow.com/a/11416002/1170730)

e.g

def func(example, f={}):
    f[example] = 1
    return f

func(2)

{2: 1}
func(3)
{2:1, 3:1}

@mengxr

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented Jun 27, 2015

Test build #35896 has started for PR 7058 at commit 95f7ff2.

@SparkQA
Copy link

SparkQA commented Jun 27, 2015

Test build #35896 has finished for PR 7058 at commit 95f7ff2.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test FAILed.

@MechCoder
Copy link
Contributor Author

jenkins test this please

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented Jun 27, 2015

Test build #35898 has started for PR 7058 at commit 95f7ff2.

@SparkQA
Copy link

SparkQA commented Jun 27, 2015

Test build #35898 has finished for PR 7058 at commit 95f7ff2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test FAILed.

@JoshRosen
Copy link
Contributor

LGTM. I used pylint / prospector to lint this file and it looks like this has fixed all occurrences of the "dangerous-default-value" pattern in this file.

In the future, it might be nice to enable a lint check for this as part of the automated build checks, but lets do that separately (I've filed https://issues.apache.org/jira/browse/SPARK-8706 for this).

@mengxr, I'll let you merge this.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@MechCoder
Copy link
Contributor Author

I pushed one more instance that I came across today. Btw, what's with the comment showing test failed but the build showing test passed.

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #36003 has started for PR 7058 at commit 40a5eb2.

@JoshRosen
Copy link
Contributor

The spurious failure message from AmplabJenkins is a side-effect of some changes which skip the JVM tests for PRs that only touch certain PySpark modules. Ignore the status messages from AmplabJenkins and pay attention to SparkQA instead. I'm trying to fix this via the Jenkins configuration.

@MechCoder
Copy link
Contributor Author

Thanks for the clarification!

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #36003 has finished for PR 7058 at commit 40a5eb2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test FAILed.

@MechCoder
Copy link
Contributor Author

jenkins test this please

@JoshRosen
Copy link
Contributor

The PySpark ML tests passed, so this looks fine to me.

@JoshRosen
Copy link
Contributor

Jenkins, retest this please.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@JoshRosen
Copy link
Contributor

I'm testing out a slightly silly hack to see if I can prevent the spurious AmplabJenkins errors: at the bottom of our Jenkins "execute shell" command, I'm adding an extra call to build/sbt unsafe/test, which will run a Java test suite which only takes maybe 5 seconds to run. This will ensure that we always run at least one JVM-based test, which will create a JUnit XML test report; this should suppress the "could not find test reports" warning from the Jenkins JUnit plugin. Let's see if it works...

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #36005 has started for PR 7058 at commit 40a5eb2.

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #36005 has finished for PR 7058 at commit 40a5eb2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@JoshRosen
Copy link
Contributor

Woohoo, looks like my hack worked :)

@JoshRosen
Copy link
Contributor

A slightly related concern is mutation of non-default arguments. There might be cases where we need to defensively copy the input arguments, but I'm not familiar enough with MLlib to know where they might occur.

@MechCoder
Copy link
Contributor Author

Hmm. Scikit-learn has a copy argument in most of the model constructors to avoid this. Imagine a person feeding huge amounts of data to find all of them overwritten. But that's lesser of a concern here because the input data is mostly in the form of RDD's and DataFrames. We can always fix it when it is reported as broken :P

@JoshRosen
Copy link
Contributor

Jenkins, retest this please.

@JoshRosen
Copy link
Contributor

(Re-testing to test the new PySpark test parallelization)

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36097 has started for PR 7058 at commit 40a5eb2.

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36097 has finished for PR 7058 at commit 40a5eb2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

asfgit pushed a commit that referenced this pull request Jun 30, 2015
…be immutable

It might be dangerous to have a mutable as value for default param. (http://stackoverflow.com/a/11416002/1170730)

e.g

    def func(example, f={}):
        f[example] = 1
        return f

    func(2)

    {2: 1}
    func(3)
    {2:1, 3:1}

mengxr

Author: MechCoder <manojkumarsivaraj334@gmail.com>

Closes #7058 from MechCoder/pipeline_api_playground and squashes the following commits:

40a5eb2 [MechCoder] copy
95f7ff2 [MechCoder] [SPARK-8679] [PySpark] [MLlib] Default values in Pipeline API should be immutable

(cherry picked from commit 5fa0863)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@mengxr
Copy link
Contributor

mengxr commented Jun 30, 2015

Merged into master and branch-1.4. Thanks!

@asfgit asfgit closed this in 5fa0863 Jun 30, 2015
@MechCoder MechCoder deleted the pipeline_api_playground branch June 30, 2015 17:28
@mengxr
Copy link
Contributor

mengxr commented Jun 30, 2015

@MechCoder The JIRA should be 8678 instead of 8679. Please check the JIRA number next time:)

@MechCoder
Copy link
Contributor Author

alright, thanks! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants