-
Notifications
You must be signed in to change notification settings - Fork 111
Add partition-store schema metadata accessors #3851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tillrohrmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating this PR @muhamadazmy. The technical changes of this PR look good to me. One question I had was about the requirements for the stored schema. Do we need to access only the latest know schema or are there cases where we need to access older versions as well?
| &mut self, | ||
| ) -> impl Future<Output = Result<Option<PartitionDurability>>> + Send + '_; | ||
|
|
||
| fn get_schema(&mut self) -> impl Future<Output = Result<Option<Schema>>> + Send + '_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are our requirements for storing the schema? Do we only need to have access to the latest known schema version or do we also need to be able to access earlier versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tillrohrmann for your review. The current requirement is only to serve for the concurrency control RIP document. So far it's enough to track only the latest known version of the schema as long as it's consistent across all replicas. The idea is that queuing control will depend heavily on user settings which will be available through the schema. In the future their might rise the need to track older version of the schema then it's probably okay to introduce this later if needed.
That being said, this schema "record" need to be readable by every subsequent version of restate hence it is not a good idea to use the flexbuffer serialization format in this case. I am waiting on @slinkydeveloper to finish some refactor work over the schema before I adapt those changes and see if it's better to move to bilrost or build some shadow type specifically for the wal record
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flexbuffer serialization format in this case.
Why flexbuffers is problematic here? If possible, it would be nice to retain the json like properties for this schema, as it makes easy to evolve them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because flexbuffers has no means of safely evolving the schema type in a backward compatible way. Unlike bilrost and protobuf. We need to be careful to always make sure type changes are backward compatible. Which has been confirmed when I synced with you on this. So we can safely ignore this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because flexbuffers has no means of safely evolving the schema type in a backward compatible way.
I'm not sure I follow this point, at the end of the day backward compatibility is guaranteed by what we put in there, and whether we make fields optional or not and how we depend on those fields being there or not. The point of using one format rather than another, I don't follow this one.
My only concern is the following: let's try to use just one encoding for Schema, be it bilrost, protobuf, or flexbuffers I don't really mind. But having multiple encodings, one when it's inside the metadata store and one when it's inside the PP, that's a guaranteed footgun for us.
AhmedSoliman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good, I've left a few minor comments on naming but nothing that can't be on a follow up PR.
|
|
||
| /// Schema versions are represented as a strictly monotonically increasing number. | ||
| /// This represent the partition storage schema version, not the user services schema. | ||
| pub(crate) const SCHEMA_VERSION: u64 = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this should be renamed to storage version or something.
| } | ||
|
|
||
| #[inline] | ||
| fn put_kv<K: TableKey, V: PartitionStoreProtobufValue + Clone + 'static>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this should be put_kv_proto for consistency.
## Summary - add a dedicated metadata slot and read/write helpers for storing service schemas in the partition-store FSM table - extend the storage-api FSM table traits with schema accessors so callers can persist schema metadata
Add partition-store schema metadata accessors
Summary
Stack created with Sapling. Best reviewed with ReviewStack.