Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-32310][ML][PySpark] ML params default value parity in classification, regression, clustering and fpm #29112

Closed
wants to merge 2 commits into from

Conversation

huaxingao
Copy link
Contributor

What changes were proposed in this pull request?

set params default values in trait ...Params in both Scala and Python.
I will do this in two PRs. I will change classification, regression, clustering and fpm in this PR. Will change the rest in another PR.

Why are the changes needed?

Make ML has the same default param values between estimator and its corresponding transformer, and also between Scala and Python.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests

@SparkQA
Copy link

SparkQA commented Jul 14, 2020

Test build #125861 has finished for PR 29112 at commit 72c17e9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Copy link
Contributor Author

cc @srowen @viirya @zhengruifeng

@srowen
Copy link
Member

srowen commented Jul 15, 2020

So in theory this shouldn't change behavior, or if it does, it's fixing an incompatibility that's likely more a bug than anything right?

@huaxingao
Copy link
Contributor Author

So in theory this shouldn't change behavior, or if it does, it's fixing an incompatibility that's likely more a bug than anything right?

Yes

@viirya
Copy link
Member

viirya commented Jul 15, 2020

"classification, regression, clustering and fpm" instead of "part 1" in the title?

@huaxingao huaxingao changed the title [SPARK-32310][ML][PySpark] ML params default value parity part 1 [SPARK-32310][ML][PySpark] ML params default value parity in classification, regression, clustering and fpm Jul 15, 2020
Copy link
Contributor

@zhengruifeng zhengruifeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -68,6 +68,12 @@ private[clustering] trait BisectingKMeansParams extends Params with HasMaxIter
"The minimum number of points (if >= 1.0) or the minimum proportion " +
"of points (if < 1.0) of a divisible cluster.", ParamValidators.gt(0.0))


setDefault(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total nit: make these params in single line, like above places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks!

@@ -85,7 +85,6 @@ class FMClassifier @Since("3.0.0") (
*/
@Since("3.0.0")
def setFactorSize(value: Int): this.type = set(factorSize, value)
setDefault(factorSize -> 8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do the default params of FMClassifier move?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I move setDefault for these params to FactorizationMachinesParams in FMRegressor when I fixed solver last time

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay if we are sure this doesn't change behavior.

@huaxingao
Copy link
Contributor Author

@srowen If it's OK with you, I will merge after test passes so I can finish the 2nd part. I will merge to both 3.0.1 and master.

@SparkQA
Copy link

SparkQA commented Jul 16, 2020

Test build #125992 has finished for PR 29112 at commit 4bc1053.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao huaxingao closed this in 383f5e9 Jul 16, 2020
@huaxingao
Copy link
Contributor Author

Merged to mater. Thank you all for reviewing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants