-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-34023][BUILD] Updated to Kryo 5.0.3 #31059
Conversation
ok to test |
Thank you for making a PR, @nicknezis . |
I created SPARK-34023 for this. |
cc @srowen |
@nicknezis Are there any exciting new features in Kryo 5.x? |
Test build #133724 has finished for PR 31059 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this actually updates the kryo version used? the pom.xml files were not changed.
This could be fine, but we'll have to see if it breaks anything once Spark actually uses it.
The migration guide is at https://github.com/EsotericSoftware/kryo/wiki/Migration-to-v5
For example requiring registration by default may be a breaking change that we have to explicitly disable.
Yes I was a little bit unsure about this. The pom.xml files didn't seem to reference a version. And the previous PR #22179 that updated the version to 4.0.2 seemed to only touch these files. If you can let me know where the version information is actually stored, I can update that and redo any testing and follow-up code edits. |
Yeah I'm pretty sure because it only comes in transitively via chill. Updating it might be a good thing, but, might break chill. Note that chill uses a shaded version of kryo. You can try to override the kryo-shaded dependency in the pom.xml file, but maybe better to see if there is a newer chill release that already uses it? |
Ahh OK perfect. And is Chill the only thing that's bringing it in as a transitive dependency? I'll do more research down that path and look into upgrading it. I recently did the upgrade for Apache Heron and it wasn't that big of a change. If it isn't a simple chill version bump, should we close this pull request and keep SPARK-34023 for me to provide updates? |
It looks like it's just chill. Spark doesn't use it directly though depends on its behavior. 0.9.5 looks like the latest chill release though. It may Just Work if you bump the kryo version, and could be worth a shot. If so that could be fine; only downside is now we manually manage the kryo version too, but, maybe that's OK. Yes keep the JIRA open / update the title/description. |
Kubernetes integration test starting |
Oh, I think I'm going to try to update Chill itself to support the newer Kryo. Then we can just bump the version of Chill in Spark. I don't think we should add the specific version of Kryo in Spark if it's not needed. I assume those dep files I manually updated are dynamically generated by some process? Is this something I would need to do as part of the process in updating Chill? This would explain why I was mistaken when referencing the older pull request that mentioned updating Kryo to 4.0.2. |
Kubernetes integration test status success |
Yes they're generated by the build process to detect unexpected changes in dependencies. There is a script in dev/ that can update it. But the file itself doesn't do anything beyond that check. |
Test build #133757 has finished for PR 31059 at commit
|
Thanks for working on this, @nicknezis !
I'd like to see if this change can actually improves performance. If possible, could you run the kryo benchmark? |
Until I get the opportunity to run the Spark benchmark, this Kryo benchmark might give some insight into what the changes might look like. |
If updating chill means updating to kryo 5 and that means changing some imports in Spark, that's probably OK. Yes, kryo ends up in the user app namespace and that's a change that could be visible, but I don't think we expect users to rely on kryo directly. Still that might be a point up for discussion. |
I'm sorry I lost steam on this. I plan to revisit, but got stuck in a morass of updating Chill, Storm and other dependencies. I think I have the Chill dependency updated. Working through the Spark changes now. |
What changes were proposed in this pull request?
Updating Kryo to a newer version. Release information can be found here: https://github.com/EsotericSoftware/kryo/releases
Why are the changes needed?
Kryo 4.0.2 was released in March 2018. This 5.X release has many new features and performance improvements.
Does this PR introduce any user-facing change?
No
How was this patch tested?
mvn clean test