[SPARK-21915][ML][PySpark]Model 1 and Model 2 ParamMaps Missing #19126

Closed
wants to merge 1 commit

Conversation

@marktab commented Sep 5, 2017

@dongjoon-hyun @HyukjinKwon

There is an error in the PySpark example code:
https://github.com/apache/spark/blob/master/examples/src/main/python/ml/estimator_transformer_param_example.py

The original Scala code says
println("Model 2 was fit using parameters: " + model2.parent.extractParamMap)

The parent is the lr estimator.

In PySpark there is no way to access the parent estimator from the model the way the Scala code does.

The proposed code has been tested in Python and returns values consistent with the Scala output.

What changes were proposed in this pull request?

Proposing to call extractParamMap() on the lr estimator variable instead of on model1 or model2; a sketch of the change follows.
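
For reference, a minimal sketch of the proposed change, following the setup in the linked example (data, param maps, and variable names mirror that example; treat this as an illustration of the idea, printing the estimator's params and, for model2, merging in the fit-time overrides, rather than the exact patch):

from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("EstimatorTransformerParamExample").getOrCreate()

# Training data as in the linked example: (label, features) rows.
training = spark.createDataFrame([
    (1.0, Vectors.dense([0.0, 1.1, 0.1])),
    (0.0, Vectors.dense([2.0, 1.0, -1.0])),
    (0.0, Vectors.dense([2.0, 1.3, 1.0])),
    (1.0, Vectors.dense([0.0, 1.2, -0.5]))], ["label", "features"])

lr = LogisticRegression(maxIter=10, regParam=0.01)
model1 = lr.fit(training)

# The Scala example prints model1.parent.extractParamMap; in PySpark the
# estimator itself is the parent, so print its params instead.
print("Model 1 was fit using parameters: ")
print(lr.extractParamMap())

# Fit-time overrides, as in the example.
paramMapCombined = {lr.maxIter: 30, lr.regParam: 0.1, lr.threshold: 0.55,
                    lr.probabilityCol: "myProbability"}
model2 = lr.fit(training, paramMapCombined)

# In Spark 2.1/2.2, model2.extractParamMap() prints nothing useful here (see
# "How was this patch tested?" below); printing the estimator's param map
# merged with the overrides matches the expected output listed below.
print("Model 2 was fit using parameters: ")
print(lr.extractParamMap(extra=paramMapCombined))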

How was this patch tested?

This patch was tested with Spark 2.1.0 by comparing the Scala and PySpark results. At present, PySpark prints nothing for those two print lines.

The output for model2 in PySpark should be:

{Param(parent='LogisticRegression_4187be538f744d5a9090', name='tol', doc='the convergence tolerance for iterative algorithms (>= 0).'): 1e-06,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='elasticNetParam', doc='the ElasticNet mixing parameter, in range [0, 1]. For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 penalty.'): 0.0,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='predictionCol', doc='prediction column name.'): 'prediction',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='featuresCol', doc='features column name.'): 'features',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='labelCol', doc='label column name.'): 'label',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='probabilityCol', doc='Column name for predicted class conditional probabilities. Note: Not all models output well-calibrated probability estimates! These probabilities should be treated as confidences, not precise probabilities.'): 'myProbability',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='rawPredictionCol', doc='raw prediction (a.k.a. confidence) column name.'): 'rawPrediction',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='family', doc='The name of family which is a description of the label distribution to be used in the model. Supported options: auto, binomial, multinomial'): 'auto',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='fitIntercept', doc='whether to fit an intercept term.'): True,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='threshold', doc='Threshold in binary classification prediction, in range [0, 1]. If threshold and thresholds are both set, they must match.e.g. if threshold is p, then thresholds must be equal to [1-p, p].'): 0.55,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='aggregationDepth', doc='suggested depth for treeAggregate (>= 2).'): 2,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='maxIter', doc='max number of iterations (>= 0).'): 30,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='regParam', doc='regularization parameter (>= 0).'): 0.1,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='standardization', doc='whether to standardize the training features before fitting the model.'): True}

Please review http://spark.apache.org/contributing.html before opening a pull request.

@marktab changed the title Model 1 and Model 2 ParamMaps Missing [ML][PySpark]Model 1 and Model 2 ParamMaps Missing Sep 5, 2017
@marktab changed the title [ML][PySpark]Model 1 and Model 2 ParamMaps Missing [SPARK-19126][ML][PySpark]Model 1 and Model 2 ParamMaps Missing Sep 5, 2017
@marktab changed the title [SPARK-19126][ML][PySpark]Model 1 and Model 2 ParamMaps Missing [SPARK-xxxx][ML][PySpark]Model 1 and Model 2 ParamMaps Missing Sep 5, 2017
@marktab changed the title [SPARK-xxxx][ML][PySpark]Model 1 and Model 2 ParamMaps Missing [SPARK-21915][ML][PySpark]Model 1 and Model 2 ParamMaps Missing Sep 5, 2017
@HyukjinKwon (Member)

ok to test

@SparkQA commented Sep 5, 2017

Test build #81413 has finished for PR 19126 at commit 76e5da7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk (Contributor) commented Sep 6, 2017

So I'm wondering if @BryanCutler's change for copying the params means that this should only be applied to old branches (2.2)?

@BryanCutler (Member)

Yeah, I checked and this is not a problem in master since #17849
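
For context, a hedged sketch of what that means for the example (assuming a build that includes the params-copying change referenced above, and reusing the lr, training, and paramMapCombined names from the sketch in the description; an illustration, not the exact behaviour of any specific release):

# With params copied from the estimator to the fitted model, the original
# example line should report the fit-time parameters directly.
model2 = lr.fit(training, paramMapCombined)
print("Model 2 was fit using parameters: ")
print(model2.extractParamMap())  # expected to include maxIter=30, regParam=0.1, threshold=0.55, ...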

@srowen (Member) commented Sep 6, 2017

@marktab you can either close this or reopen it against branch-2.2

@marktab (Author) commented Sep 6, 2017

Thanks @srowen

@marktab closed this Sep 6, 2017
@marktab (Author) commented Sep 6, 2017

@srowen since I am new to this review process, should I be seeing the change at http://spark.apache.org/docs/latest/ml-pipeline.html?

@srowen (Member) commented Sep 6, 2017

No, no change has been made. @BryanCutler suggests that the change isn't needed in master but could be applied to branch-2.2. To do that you need to reopen a new PR against that branch. It still wouldn't make any existing docs update, but would affect the next release and its docs.
