Skip to content

[SPARK-23234][ML][PYSPARK] Remove setting defaults on Java params #20410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 2 additions & 3 deletions python/pyspark/ml/wrapper.py
Original file line number Diff line number Diff line change
Expand Up @@ -118,10 +118,9 @@ def _transfer_params_to_java(self):
"""
Transforms the embedded params to the companion Java object.
"""
paramMap = self.extractParamMap()
for param in self.params:
if param in paramMap:
pair = self._make_java_param_pair(param, paramMap[param])
if param in self._paramMap:
pair = self._make_java_param_pair(param, self._paramMap[param])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent would be more clear if you called

if self.isSet(param):
    pair = self._make_java_param_pair(param, self.getOrDefault[param])

And to be 100% consistent with Scala, it might be a good idea to transfer default values anyway. That way, just in case the python default was somehow changed or different than scala it wouldn't cause an issue that would be really hard to detect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment. There is only one problem: you can't transfer the default values to Scala without using set. Indeed, setDefault is protected, so it can't be called by Python.
Moreover, we already have test cases which ensures that the defaults in Python and Scala are the same. And since the user can't change a default, we are on the safe side. As I said before, I think that a good next step would be to read the defaults from Scala in Python and this will make us sure that they will be always consistent.
Thanks for the suggestion, but I don't really agree that using getOrDefault would make the intent more clear: on the opposite, I think it might be confusing. I can use isSet as you suggested, instead, or I can add a comment, what do you think?

self._java_obj.set(pair)

def _transfer_param_map_to_java(self, pyParamMap):
Expand Down