Add abstractions for tiered storage of stream data #196

the-mikedavis · 2025-10-14T15:52:15Z

This is an evolution to #194 with a different approach to adding the necessary hooks for #184. It's quite different so I wanted to make a new PR to preserve the old history.

The main difference is that the "log reader" part is at a higher level. Instead of file-like functions with callbacks like open/1, pread/3 and close/1 we have callbacks more like what's already in osiris_log: init_offset_reader/2, send_file/3, chunk_iterator/3 and iterator_next/1. @kjnilsson suggested something higher level when we last discussed the reader. It's much more natural than wrapping file operations after rebasing on #192. Implementors have more responsibilities, but it makes for much less invasive changes into osiris_log and it doesn't need many changes for callers: just search-and-replace osiris_log with osiris_log_reader (see here). It also makes the manifest behaviour less complex and invasive. The manifest mainly provides hooks into events like whenever a chunk is written or a segment file rolled over. It also has hooks into writer and acceptor start-up and closing.

Use of these abstractions for S3 tiered storage can be found in amazon-mq/rabbitmq-stream-s3#2 - not everything works well yet but it shows a good sketch of using the reader and manifest behaviors.

This change introduces a behaviour `osiris_log_reader` which can be implemented externally to read from a stream at a given offset spec. This closes over the high-level reading operations `send_file/3` and `chunk_iterator/3`. `osiris:init_reader/4` selects the reader module based on application env, and then callers use `osiris_log_reader` to interact with the reader. By default all of these functions delegate to `osiris_log`. `osiris_log` doesn't need any meaningful changes this way. The only change is to expose the `header_map()` type.

This can be used flexibly to evaluate retention depending on the name or contents of index files. You could pass in a function which would return true until the offset in the index file's name exceeds some given offset for example, as a way to truncate everything up to that offset or to guarantee that the offset (for example an uncommitted one) won't be truncated.

This is important for allowing other manifest implementations (see child commits) to include additional info in the map. The range and epoch offsets are the only required keys and other manifests might include other info. While this change is not complicated, it is breaking and needs backward compatibility clauses in callers of `overview/1` and `init_acceptor/3`. And it ends up moving a lot of code around, inflating the diff.

This change refactors `parse_header/2` to take the chunk header binary and the position at which it was read and return a `header_map()`. This is useful for other readers - so that they do not need to duplicate the binary match code and `next_position` calculation.

`osiris_log_manifest` adds a few callbacks which are executed during initialization of writers and at events like rolling segments and writing chunks. `overview/1` is brought into this behavior as well so that other implementers can extend the `overview()` type (by setting additional keys in the map).

the-mikedavis · 2025-10-14T15:58:24Z

src/osiris.erl

+-type retention_fun() :: fun((IdxFile :: file:filename_all()) -> boolean()).
 -type retention_spec() ::
-    {max_bytes, non_neg_integer()} | {max_age, milliseconds()}.
+    {max_bytes, non_neg_integer()} |
+    {max_age, milliseconds()} |
+    {'fun', retention_fun()}.


The retention spec part of this PR could probably be ported out on its own. It could also be used to provide a feature unrelated to tiered storage like rabbitmq/rabbitmq-server#14413 - retention up to an offset

the-mikedavis · 2025-10-14T16:16:42Z

src/osiris_log.erl

 -spec overview(file:filename_all()) ->
-    {range(), [{epoch(), offset()}]}.
+    overview().
 overview(Dir) ->


The change to return a map from overview/1 is a big (but boring) part of the diff - it means lots of changes to osiris_log_SUITE. I think returning a map is the right API here but we will probably need to be careful with it since it's a breaking change. The way the return type is changed now would probably cause errors in mixed-version clusters

the-mikedavis added 5 commits October 13, 2025 11:28

the-mikedavis requested review from SimonUnge and lukebakken October 14, 2025 15:52

the-mikedavis self-assigned this Oct 14, 2025

This was referenced Oct 14, 2025

Add reader and manifest abstractions for osiris_log #194

Closed

file API for Osiris reads. #190

Closed

the-mikedavis linked an issue Oct 14, 2025 that may be closed by this pull request

Tiered Storage Support for RabbitMQ Streams #184

Open

the-mikedavis commented Oct 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add abstractions for tiered storage of stream data #196

Add abstractions for tiered storage of stream data #196

Uh oh!

the-mikedavis commented Oct 14, 2025

Uh oh!

the-mikedavis Oct 14, 2025

Uh oh!

the-mikedavis Oct 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Add abstractions for tiered storage of stream data #196

Are you sure you want to change the base?

Add abstractions for tiered storage of stream data #196

Uh oh!

Conversation

the-mikedavis commented Oct 14, 2025

Uh oh!

the-mikedavis Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

the-mikedavis Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant