-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32310][ML][PySpark] ML params default value parity in classification, regression, clustering and fpm #29112
Conversation
Test build #125861 has finished for PR 29112 at commit
|
So in theory this shouldn't change behavior, or if it does, it's fixing an incompatibility that's likely more a bug than anything right? |
Yes |
"classification, regression, clustering and fpm" instead of "part 1" in the title? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -68,6 +68,12 @@ private[clustering] trait BisectingKMeansParams extends Params with HasMaxIter | |||
"The minimum number of points (if >= 1.0) or the minimum proportion " + | |||
"of points (if < 1.0) of a divisible cluster.", ParamValidators.gt(0.0)) | |||
|
|||
|
|||
setDefault( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total nit: make these params in single line, like above places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Thanks!
@@ -85,7 +85,6 @@ class FMClassifier @Since("3.0.0") ( | |||
*/ | |||
@Since("3.0.0") | |||
def setFactorSize(value: Int): this.type = set(factorSize, value) | |||
setDefault(factorSize -> 8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do the default params of FMClassifier
move?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I move setDefault for these params to FactorizationMachinesParams in FMRegressor when I fixed solver last time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay if we are sure this doesn't change behavior.
@srowen If it's OK with you, I will merge after test passes so I can finish the 2nd part. I will merge to both 3.0.1 and master. |
Test build #125992 has finished for PR 29112 at commit
|
Merged to mater. Thank you all for reviewing! |
What changes were proposed in this pull request?
set params default values in trait ...Params in both Scala and Python.
I will do this in two PRs. I will change classification, regression, clustering and fpm in this PR. Will change the rest in another PR.
Why are the changes needed?
Make ML has the same default param values between estimator and its corresponding transformer, and also between Scala and Python.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests