
[SPARK-15402] [ML] [PySpark] PySpark ml.evaluation should support save/load #13194


Closed
yanboliang wants to merge 3 commits into apache:master from yanboliang:spark-15402

Conversation

yanboliang
Contributor

What changes were proposed in this pull request?

Since ml.evaluation already supports save/load on the Scala side, supporting it on the Python side is straightforward.

How was this patch tested?

Added a Python doctest.
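
A minimal sketch (not taken from the patch) of the round trip this change enables, assuming a local SparkSession and a temporary path chosen only for illustration:

```python
import tempfile

from pyspark.sql import SparkSession
from pyspark.ml.evaluation import RegressionEvaluator

spark = SparkSession.builder.master("local[2]").appName("evaluator-save-load").getOrCreate()

# Configure an evaluator, persist it, and read it back.
evaluator = RegressionEvaluator(metricName="rmse")
path = tempfile.mkdtemp() + "/rmse_evaluator"
evaluator.save(path)                      # provided by the JavaMLWritable mixin
loaded = RegressionEvaluator.load(path)   # provided by the JavaMLReadable mixin
assert loaded.getMetricName() == "rmse"

spark.stop()
```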

@SparkQA

SparkQA commented May 19, 2016

Test build #58860 has finished for PR 13194 at commit 57a29c3.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class BinaryClassificationEvaluator(JavaEvaluator, HasLabelCol, HasRawPredictionCol,
    • class RegressionEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol,
    • class MulticlassClassificationEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol,
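
The trailing commas above are truncation in the CI output. A hedged sketch of the shape those declarations take once the read/write mixins are added (the class name here is a hypothetical stand-in, not copied from the patch):

```python
from pyspark.ml.evaluation import JavaEvaluator
from pyspark.ml.param.shared import HasLabelCol, HasPredictionCol
from pyspark.ml.util import JavaMLReadable, JavaMLWritable


class ExampleRegressionEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol,
                                 JavaMLReadable, JavaMLWritable):
    """Illustrative only: shows the base-class list with the new mixins; the
    real RegressionEvaluator lives in pyspark.ml.evaluation."""
```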

@SparkQA

SparkQA commented May 19, 2016

Test build #58864 has finished for PR 13194 at commit d3c4538.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Contributor Author

cc @jkbradley @mengxr

 from pyspark.context import SparkContext
 from pyspark.sql import SQLContext
-globs = globals().copy()
+globs = pyspark.ml.evaluation.__dict__.copy()
@thunterdb
Contributor

can you explain why you need this change (and add a comment about it)?

Contributor Author


@thunterdb If we don't make this change, everything works when running all the tests via ./bin/pyspark, but it fails when run with ./python/run-tests --modules=pyspark-ml. This issue is weird; I still haven't figured out the root cause, and we made the same change when adding save/load elsewhere.
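
For context, a sketch of the doctest harness pattern being discussed, following the convention other PySpark ml modules already use (the structure and names here are illustrative, not the exact patch text):

```python
if __name__ == "__main__":
    import doctest
    import pyspark.ml.evaluation
    from pyspark.sql import SparkSession

    # Copy the module's own namespace (rather than globals()) so the doctests
    # resolve the evaluator classes both when run through ./bin/pyspark and
    # through ./python/run-tests --modules=pyspark-ml.
    globs = pyspark.ml.evaluation.__dict__.copy()
    spark = SparkSession.builder \
        .master("local[2]") \
        .appName("ml.evaluation doctests") \
        .getOrCreate()
    globs['spark'] = spark

    (failure_count, test_count) = doctest.testmod(
        globs=globs, optionflags=doctest.ELLIPSIS)
    spark.stop()
    if failure_count:
        exit(-1)
```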

@SparkQA

SparkQA commented May 24, 2016

Test build #59204 has finished for PR 13194 at commit cd0c9d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor

holdenk commented Oct 7, 2016

This seems pretty reasonable. Do you have time to update it to master, and then maybe we could get @MLnick to take a quick look?

@SparkQA

SparkQA commented Oct 10, 2016

Test build #66665 has finished for PR 13194 at commit 8764e83.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Contributor Author

Resolved merge conflicts. Ping @holdenk @MLnick @srowen @jkbradley to take a look when you are available. This just adds the Python API and should be very straightforward to move ahead. Thanks.

@@ -21,7 +21,8 @@
 from pyspark.ml.wrapper import JavaParams
 from pyspark.ml.param import Param, Params, TypeConverters
 from pyspark.ml.param.shared import HasLabelCol, HasPredictionCol, HasRawPredictionCol
-from pyspark.ml.common import inherit_doc
+from pyspark.ml.util import JavaMLReadable, JavaMLWritable
+from pyspark.mllib.common import inherit_doc
Member


Just wondering why the change to import from mllib.common?

Contributor Author


Oops, typo.
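
For reference, the import block presumably reads as follows after fixing the typo (mllib.common swapped back to ml.common; the util import is the new addition that enables save/load):

```python
from pyspark.ml.common import inherit_doc
from pyspark.ml.util import JavaMLReadable, JavaMLWritable
```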

@BryanCutler
Member

Just one question, otherwise LGTM

globs=globs, optionflags=doctest.ELLIPSIS)
spark.stop()
if failure_count:
    exit(-1)
Member


Should there still be an exit(-1) when there are failures? If not, then you can remove the returned variables from doctest.testmod.

Contributor Author


Thanks for the reminder; added a check on failure_count.

@SparkQA

SparkQA commented Oct 12, 2016

Test build #66827 has finished for PR 13194 at commit cc85a92.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor

holdenk commented Oct 12, 2016

ping @MLnick & @davies for final review if they have bandwidth around this?

@yanboliang
Contributor Author

Since this is very straightforward, I'll merge it into master. Any comments are still appreciated, and I can address them in follow-up work. Thanks for the review! @holdenk @BryanCutler

@asfgit asfgit closed this in 1db8fea Oct 14, 2016
@yanboliang yanboliang deleted the spark-15402 branch October 14, 2016 11:19
robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
…load

## What changes were proposed in this pull request?
Since ```ml.evaluation``` has supported save/load at Scala side, supporting it at Python side is very straightforward and easy.

## How was this patch tested?
Add python doctest.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes apache#13194 from yanboliang/spark-15402.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…load
