-
Notifications
You must be signed in to change notification settings - Fork 12
Add abstractions for tiered storage of stream data #196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add abstractions for tiered storage of stream data #196
Conversation
This change introduces a behaviour `osiris_log_reader` which can be implemented externally to read from a stream at a given offset spec. This closes over the high-level reading operations `send_file/3` and `chunk_iterator/3`. `osiris:init_reader/4` selects the reader module based on application env, and then callers use `osiris_log_reader` to interact with the reader. By default all of these functions delegate to `osiris_log`. `osiris_log` doesn't need any meaningful changes this way. The only change is to expose the `header_map()` type.
This can be used flexibly to evaluate retention depending on the name or contents of index files. You could pass in a function which would return true until the offset in the index file's name exceeds some given offset for example, as a way to truncate everything up to that offset or to guarantee that the offset (for example an uncommitted one) won't be truncated.
This is important for allowing other manifest implementations (see child commits) to include additional info in the map. The range and epoch offsets are the only required keys and other manifests might include other info. While this change is not complicated, it is breaking and needs backward compatibility clauses in callers of `overview/1` and `init_acceptor/3`. And it ends up moving a lot of code around, inflating the diff.
This change refactors `parse_header/2` to take the chunk header binary and the position at which it was read and return a `header_map()`. This is useful for other readers - so that they do not need to duplicate the binary match code and `next_position` calculation.
`osiris_log_manifest` adds a few callbacks which are executed during initialization of writers and at events like rolling segments and writing chunks. `overview/1` is brought into this behavior as well so that other implementers can extend the `overview()` type (by setting additional keys in the map).
| -type retention_fun() :: fun((IdxFile :: file:filename_all()) -> boolean()). | ||
| -type retention_spec() :: | ||
| {max_bytes, non_neg_integer()} | {max_age, milliseconds()}. | ||
| {max_bytes, non_neg_integer()} | | ||
| {max_age, milliseconds()} | | ||
| {'fun', retention_fun()}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retention spec part of this PR could probably be ported out on its own. It could also be used to provide a feature unrelated to tiered storage like rabbitmq/rabbitmq-server#14413 - retention up to an offset
| -spec overview(file:filename_all()) -> | ||
| {range(), [{epoch(), offset()}]}. | ||
| overview(). | ||
| overview(Dir) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to return a map from overview/1 is a big (but boring) part of the diff - it means lots of changes to osiris_log_SUITE. I think returning a map is the right API here but we will probably need to be careful with it since it's a breaking change. The way the return type is changed now would probably cause errors in mixed-version clusters
This is an evolution to #194 with a different approach to adding the necessary hooks for #184. It's quite different so I wanted to make a new PR to preserve the old history.
The main difference is that the "log reader" part is at a higher level. Instead of
file-like functions with callbacks likeopen/1,pread/3andclose/1we have callbacks more like what's already inosiris_log:init_offset_reader/2,send_file/3,chunk_iterator/3anditerator_next/1. @kjnilsson suggested something higher level when we last discussed the reader. It's much more natural than wrapping file operations after rebasing on #192. Implementors have more responsibilities, but it makes for much less invasive changes intoosiris_logand it doesn't need many changes for callers: just search-and-replaceosiris_logwithosiris_log_reader(see here). It also makes the manifest behaviour less complex and invasive. The manifest mainly provides hooks into events like whenever a chunk is written or a segment file rolled over. It also has hooks into writer and acceptor start-up and closing.Use of these abstractions for S3 tiered storage can be found in amazon-mq/rabbitmq-stream-s3#2 - not everything works well yet but it shows a good sketch of using the reader and manifest behaviors.