[Improvement]: Object Serialization Optimization and Support #3355

Open
@czy006

Description

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, we use Java serialization and Kryo serialization. These methods have some issues, including low performance. We use Kryo serialization for PUT and GET operations on RocksDB, which backs the lookup join feature in Mixed Format.

The objects we store in the database also need to be serialized and deserialized. During the upgrade process, we occasionally encounter deserialization errors (as shown in the figure below).

Through research, we found that Apache Fury can improve serialization performance and solve the deserialization problems. We will provide performance test reports later to compare results before and after the replacement.

How should we improve?

  • Abstract a resource serialization interface, with one implementation for the current native Java serialization and one for Kryo serialization
  • Implement Fury serialization and provide configuration options for it, while marking the other serialization methods as deprecated
  • When the Amoro LTS version is completed, we will remove the Kryo serialization implementation
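
The first two steps above could be sketched roughly as follows. This is only an illustration under assumptions: `ResourceSerializer`, `JavaResourceSerializer`, `ResourceSerializers`, and the `"java"` / `"kryo"` / `"fury"` configuration values are hypothetical names, not existing Amoro APIs. Only the native Java implementation is shown, so that the sketch depends on nothing beyond the JDK. Kryo and Fury implementations would plug into the same interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Locale;

/** Hypothetical abstraction over serialization backends (name is an assumption). */
interface ResourceSerializer<T> {
  byte[] serialize(T value);

  T deserialize(byte[] bytes);
}

/** Native JDK serialization behind the shared interface. */
final class JavaResourceSerializer<T extends Serializable> implements ResourceSerializer<T> {
  @Override
  public byte[] serialize(T value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // Closing the ObjectOutputStream flushes everything into the buffer.
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  @Override
  @SuppressWarnings("unchecked")
  public T deserialize(byte[] bytes) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return (T) in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("Class missing during deserialization", e);
    }
  }
}

/** Hypothetical factory that picks a backend from a configuration value. */
final class ResourceSerializers {
  private ResourceSerializers() {}

  static <T extends Serializable> ResourceSerializer<T> create(String type) {
    switch (type.toLowerCase(Locale.ROOT)) {
      case "java":
        return new JavaResourceSerializer<>();
      case "kryo":
      case "fury":
        // Placeholders: these backends are intentionally left out of this sketch.
        throw new UnsupportedOperationException(type + " serializer is not part of this sketch");
      default:
        throw new IllegalArgumentException("Unknown serializer type: " + type);
    }
  }
}
```

With this shape, callers such as the RocksDB PUT and GET paths would depend only on the interface, so switching the backend becomes a configuration change rather than a code change.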

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Subtasks

No response
