[SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConverter patch #15254

Closed

JasonMWhite
Contributor

What changes were proposed in this pull request?

This PR removes a patch on ListConverter from #5570, as it is no longer necessary. The underlying Py4J issue, py4j/py4j#160, was fixed in py4j/py4j@224b94b, and that fix is included in 0.10.3, the Py4J version currently used in Spark.

How was this patch tested?

The original test added in #5570 remains.

@JasonMWhite
Contributor Author

@davies you authored #5570 and reported the issue in Py4J as py4j/py4j#160. I happened across this while spelunking through Py4J code in PySpark, and it seems like it's no longer needed. Do you agree?

@lins05
Contributor

lins05 commented Sep 27, 2016

I guess we can also remove another workaround here?

@rxin
Contributor

rxin commented Sep 27, 2016

cc @JoshRosen and @davies

@JoshRosen
Contributor

+1 on @lins05's suggestion of going further and removing any unnecessary explicit usages of ListConverter; we might as well clean up all of this in one shot, if possible.

@JasonMWhite
Contributor Author

@JoshRosen @lins05 As requested, I've removed all remaining explicit mentions of ListConverter and MapConverter, as they all seemed to be doing the same thing: getting around py4j/py4j#161 and py4j/py4j#160.

I'm not familiar with the code in pyspark-ml and pyspark-mllib, but it seemed straightforward and didn't introduce any regressions in the tests.

Contributor

@lins05 lins05 left a comment

LGTM. The purpose of the workarounds was to:

  1. avoid auto-converting bytearray to a Java ArrayList (fixed in py4j/py4j#160, "auto_convert does not work with bytearray")
  2. avoid the bug triggered when calling a Java class constructor with non-JavaObject args (fixed in py4j/py4j#161, "JavaClass does not work with auto_convert")

Since both bugs have been fixed, I think we are safe to remove the workarounds.
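
To make the first point concrete, here is a minimal sketch in plain Python of how a duck-typed "is this a list?" check can sweep up a bytearray. It assumes the converter decides by checking for iterability; the helper names `naive_can_convert` and `strict_can_convert` are hypothetical, and this is not the actual Py4J or PySpark code.

```python
# Illustrative only: these helpers are hypothetical, not Py4J or PySpark code.

def naive_can_convert(obj):
    """Duck-typed check: anything iterable is treated as a list."""
    return hasattr(obj, "__iter__")


def strict_can_convert(obj):
    """Same check, but byte and text containers are excluded so they can be
    sent as binary/string values instead of element-by-element lists."""
    return hasattr(obj, "__iter__") and not isinstance(obj, (bytes, bytearray, str))


payload = bytearray(b"\x00\x01")
print(naive_can_convert(payload))   # True: would be converted like a list
print(strict_can_convert(payload))  # False: left alone
print(strict_can_convert([1, 2]))   # True: real lists still convert
```

With a check of the second kind in the library itself, a caller-side override of the converter would no longer be needed to protect bytearray arguments.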

@JoshRosen
Contributor

Jenkins, this is ok to test.

@davies
Contributor

davies commented Sep 28, 2016

lgtm, pending on jenkins.

@SparkQA

SparkQA commented Sep 28, 2016

Test build #66045 has finished for PR 15254 at commit 1e05073.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor

holdenk commented Oct 2, 2016

+1 :)

@davies
Contributor

davies commented Oct 3, 2016

Merging this into master, thanks!

@asfgit asfgit closed this in 1f31bda Oct 3, 2016