
Support compression on serialization of big values #3324

Closed
kostasrim opened this issue Jul 16, 2024 · 7 comments

@kostasrim
Contributor

Currently the RDB loader protocol does not support compressed data, so loading breaks.

Previously, we only called FlushToSink in SaveEpilogue and before SaveBody (to flush previous entries). With the current changes, we call FlushToSink every time we serialize X bytes. This doesn't work with the protocol implemented in rdb_load, because we now split a value (say, a list) into multiple chunks, and LoadKeyValue in RdbLoader expects to read a string for the value but instead gets a compressed blob and fails with an unrecognized rdb type.

Extend the protocol to support loading compressed data in LoadKeyValuePair of the rdb_loader.

@romange
Collaborator

romange commented Jul 16, 2024

@kostasrim I suggest using RDB_OPCODE_DF_MASK for that. It's a meta opcode that can announce in advance what to expect in the key/value that is going to be parsed next. Right now we only have the DF_MASK_FLAG_STICKY bit, so you can extend it with compression bits like DF_MASK_COMPRESSED_ZSTD or DF_MASK_COMPRESSED_LZ4 and then know on the replica side to treat the value accordingly.
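
For illustration, a rough sketch of what that could look like on the loader side; the DF_MASK_COMPRESSED_* bit values and the helper functions below are hypothetical, not the actual Dragonfly implementation:

```cpp
#include <cstdint>
#include <string>

// Existing bit mentioned in the thread; the compression bits are hypothetical
// values picked for illustration only.
constexpr uint8_t DF_MASK_FLAG_STICKY     = 1 << 0;
constexpr uint8_t DF_MASK_COMPRESSED_ZSTD = 1 << 1;  // assumption
constexpr uint8_t DF_MASK_COMPRESSED_LZ4  = 1 << 2;  // assumption

// Hypothetical helpers standing in for the loader's real read/decompress paths.
std::string ReadStringBlob();
std::string ZstdDecompress(const std::string& blob);
std::string Lz4Decompress(const std::string& blob);

// Sketch: the loader remembers the mask announced by RDB_OPCODE_DF_MASK and,
// if a compression bit is set, decompresses the blob before parsing it as usual.
std::string ReadMaybeCompressedValue(uint8_t df_mask) {
  std::string blob = ReadStringBlob();
  if (df_mask & DF_MASK_COMPRESSED_ZSTD) return ZstdDecompress(blob);
  if (df_mask & DF_MASK_COMPRESSED_LZ4) return Lz4Decompress(blob);
  return blob;
}
```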

@kostasrim
Contributor Author

@romange LIT

@chakaz
Collaborator

chakaz commented Jul 17, 2024

@kostasrim does that mean that, once #3241 is merged, older versions of Dragonfly won't be able to read RDB/DFS saved with new Dragonfly versions?
If so, we need to disable that for at least one version to allow rollbacks, IMO.

@kostasrim
Contributor Author

@chakaz #3241 won't affect compression, but this PR will. If we extend the protocol as suggested by Roman, older versions won't be able to read the new format and will break.

@chakaz
Collaborator

chakaz commented Jul 17, 2024

For any such breaking change, we should first ship the write path disabled by default (toggle-able via a flag) while the read path already supports the new format. It should stay like that for at least one version; only then can we enable it by default (or even remove the flag if that makes sense).
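
A minimal sketch of that gating, assuming an Abseil-style flag (the flag name below is hypothetical):

```cpp
#include "absl/flags/flag.h"

// Hypothetical flag: off by default for at least one release so that older
// versions can still read snapshots produced by this one; the read path
// understands the new format regardless of the flag.
ABSL_FLAG(bool, serialize_big_value_compression, false,
          "Write compressed chunks for big values (readers already support it).");

bool ShouldCompressBigValues() {
  return absl::GetFlag(FLAGS_serialize_big_value_compression);
}
```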

@adiholden
Collaborator

I synced with Shahar and he brought up that there should be an API for continuous (streaming) compression. I worked on the CompressorImpl class in the past, and I think we don't need breaking changes. If we change CompressorImpl to create a context and free it only once we finish compressing the full entry, I believe this should be good. This is like breaking the compression into many frames that are sent one after the other, with the last frame emitted once we've finished serializing the entry.
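
For reference, zstd's streaming API already supports this shape directly: one context is reused across chunks and the final call ends the frame. A minimal sketch of that underlying mechanism (not Dragonfly's CompressorImpl API, and error handling is elided):

```cpp
#include <zstd.h>

#include <string>
#include <string_view>
#include <vector>

// Sketch of chunked compression: one zstd context lives for the whole entry,
// each serialized chunk is compressed and flushed as it is produced, and the
// final chunk ends the frame.
class StreamingCompressorSketch {
 public:
  StreamingCompressorSketch() : cctx_(ZSTD_createCCtx()) {
    ZSTD_CCtx_setParameter(cctx_, ZSTD_c_compressionLevel, 3);
  }
  ~StreamingCompressorSketch() { ZSTD_freeCCtx(cctx_); }

  // Compress one chunk; pass last_chunk = true for the final piece of the entry.
  std::string CompressChunk(std::string_view chunk, bool last_chunk) {
    ZSTD_inBuffer input{chunk.data(), chunk.size(), 0};
    const ZSTD_EndDirective mode = last_chunk ? ZSTD_e_end : ZSTD_e_flush;
    std::vector<char> buf(ZSTD_CStreamOutSize());
    std::string out;
    size_t remaining = 0;
    do {
      ZSTD_outBuffer output{buf.data(), buf.size(), 0};
      remaining = ZSTD_compressStream2(cctx_, &output, &input, mode);
      if (ZSTD_isError(remaining)) break;  // error handling elided in this sketch
      out.append(buf.data(), output.pos);
    } while (remaining != 0);  // 0 means all input consumed and flushed
    return out;
  }

 private:
  ZSTD_CCtx* cctx_;
};
```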

@kostasrim
Contributor Author

We decided to disable compression on big values, and we will revisit this as an optimization if needed in the future. I am closing this for now.
