feat: add basic wal implementation for Edge #24570
This WAL implementation uses some of the code from the wal crate, but departs from it significantly in many ways. For now it uses simple JSON encoding for the serialized ops, but we may want to switch that to Protobuf at some point in the future. This version of the WAL doesn't have its own buffering. That will be implemented higher up in the BufferImpl, which will use the WAL and SegmentWriter to make data in the buffer durable.
The write flow will be that writes come into the buffer and validate/update against an in-memory Catalog. Once validated, writes will get buffered up in memory and then flushed into the WAL periodically (likely every 10-20ms). After being flushed to the WAL, the entire batch of writes will be put into the in-memory queryable buffer. After that, responses will be sent back to the clients. This should reduce the write lock pressure on the in-memory buffer considerably.
In this PR:
- Update the Wal, WalSegmentWriter, and WalSegmentReader traits to line up with new design/understanding
- Implement wal (mainly just a way to identify segment files in a directory)
- Implement WalSegmentWriter (write header, op batch with crc, and track sequence number in segment, re-open existing file)
- Implement WalSegmentReader
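The write flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `PendingWrite`, `MemWal`, and `SketchBuffer` are hypothetical names, the catalog check is reduced to a trivial non-empty table name test, and the "WAL" is an in-memory list standing in for a durable file. The periodic timer (every 10-20ms) is left to the caller, which would invoke `flush` on each tick.

```rust
/// Hypothetical stand-in for a single incoming write.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingWrite {
    pub table: String,
    pub line: String,
}

/// Hypothetical WAL stand-in: keeps each flushed batch in memory
/// instead of appending it to a segment file on disk.
#[derive(Default)]
pub struct MemWal {
    pub batches: Vec<Vec<PendingWrite>>,
}

impl MemWal {
    /// Pretend to durably append one batch.
    pub fn write_batch(&mut self, batch: &[PendingWrite]) {
        self.batches.push(batch.to_vec());
    }
}

/// Hypothetical buffer that follows the flow: validate, stage,
/// flush to the WAL, then expose the batch to queries.
#[derive(Default)]
pub struct SketchBuffer {
    pending: Vec<PendingWrite>,   // validated, not yet durable
    queryable: Vec<PendingWrite>, // durable and visible to queries
    wal: MemWal,
}

impl SketchBuffer {
    /// Validate the write (a trivial check stands in for the Catalog)
    /// and stage it in memory. Nothing is queryable yet.
    pub fn accept(&mut self, write: PendingWrite) -> Result<(), String> {
        if write.table.is_empty() {
            return Err("missing table name".to_string());
        }
        self.pending.push(write);
        Ok(())
    }

    /// Flush all staged writes to the WAL as one batch, then move the
    /// whole batch into the queryable buffer. Returns how many writes
    /// can now be acknowledged to clients.
    pub fn flush(&mut self) -> usize {
        if self.pending.is_empty() {
            return 0;
        }
        let batch = std::mem::take(&mut self.pending);
        self.wal.write_batch(&batch);
        let acknowledged = batch.len();
        self.queryable.extend(batch);
        acknowledged
    }

    pub fn queryable_len(&self) -> usize {
        self.queryable.len()
    }

    pub fn wal_batches(&self) -> usize {
        self.wal.batches.len()
    }
}
```

Because a whole batch moves into the queryable buffer in one step, the queryable side would only need to be locked once per flush rather than once per write, which is the lock-pressure reduction the description refers to.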
Mostly I just had a few questions that would help me with the review before I feel comfortable approving it, but otherwise this looks really solid to me.
const FILE_TYPE_IDENTIFIER: &[u8] = b"idb3.001";

/// File extension for segment files
const SEGMENT_FILE_EXTENSION: &str = "wal";
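To make the "write header, op batch with crc, and track sequence number" idea concrete, here is a minimal sketch of how a segment could be framed. The byte layout (big-endian sequence number, payload length, CRC-32, then payload) and the names `SketchSegmentWriter` and `read_segment` are assumptions for illustration and are not taken from this PR; only `FILE_TYPE_IDENTIFIER` comes from the code under review. The CRC-32 is a plain bitwise implementation so the example needs no external crates.

```rust
use std::io::{self, Write};

const FILE_TYPE_IDENTIFIER: &[u8] = b"idb3.001";

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn be_u32(b: &[u8]) -> u32 {
    u32::from_be_bytes([b[0], b[1], b[2], b[3]])
}

/// Hypothetical writer: emits the file header once, then one frame per batch.
pub struct SketchSegmentWriter<W: Write> {
    out: W,
    next_sequence: u32,
}

impl<W: Write> SketchSegmentWriter<W> {
    pub fn new(mut out: W) -> io::Result<Self> {
        out.write_all(FILE_TYPE_IDENTIFIER)?;
        Ok(Self { out, next_sequence: 1 })
    }

    /// Append one serialized op batch and return its sequence number.
    /// Assumed frame: sequence (4) | length (4) | crc32 (4) | payload.
    pub fn write_batch(&mut self, payload: &[u8]) -> io::Result<u32> {
        let seq = self.next_sequence;
        self.out.write_all(&seq.to_be_bytes())?;
        self.out.write_all(&(payload.len() as u32).to_be_bytes())?;
        self.out.write_all(&crc32(payload).to_be_bytes())?;
        self.out.write_all(payload)?;
        self.next_sequence += 1;
        Ok(seq)
    }

    pub fn into_inner(self) -> W {
        self.out
    }
}

/// Hypothetical reader: checks the header and every checksum, and
/// returns (sequence number, payload) pairs in file order.
pub fn read_segment(bytes: &[u8]) -> Result<Vec<(u32, Vec<u8>)>, String> {
    let mut rest = bytes
        .strip_prefix(FILE_TYPE_IDENTIFIER)
        .ok_or("bad file type identifier")?;
    let mut batches = Vec::new();
    while !rest.is_empty() {
        if rest.len() < 12 {
            return Err("truncated frame header".to_string());
        }
        let seq = be_u32(&rest[0..4]);
        let len = be_u32(&rest[4..8]) as usize;
        let crc = be_u32(&rest[8..12]);
        let body = &rest[12..];
        if body.len() < len {
            return Err("truncated payload".to_string());
        }
        let payload = &body[..len];
        if crc32(payload) != crc {
            return Err(format!("checksum mismatch in batch {}", seq));
        }
        batches.push((seq, payload.to_vec()));
        rest = &body[len..];
    }
    Ok(batches)
}
```

Writing into any `std::io::Write` keeps the sketch testable against a `Vec<u8>`; a real writer would target a file and would also need to recover the next sequence number when re-opening an existing segment.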
I think I might have been confused about our previous offline conversations. Are the segment files and wal one and the same? My understanding was that the wal just contained data that had not been persisted yet (with maybe 1 or more segments of nonpersisted data) and a segment that was persisted to disk was a blob of data containing the list of all the parquet files for that segment. To me the terminology feels a bit fuzzy and intermingled.
The wal contains all data that is in the in-memory buffer for a given segment. So there are a few different things here:
- A Buffer Segment (in memory collection of writes)
- A WAL Segment (a file on locally attached disk that has the durable record of what is in a buffer segment)
- A Segment File (a file in object store that has the summary information of what parquet files were persisted for a given buffer segment)
The Segment File could arguably be renamed to something more like segment_persist_info or something like that.
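One way to keep the three concepts apart in code is to give each its own type. The sketch below is illustrative only: the struct names, their fields, and the numeric file naming scheme (for example `00000001.wal`) are assumptions, not what this PR defines. It also shows a hedged take on "identifying segment files in a directory" by filtering file names on the `wal` extension.

```rust
use std::path::{Path, PathBuf};

const SEGMENT_FILE_EXTENSION: &str = "wal";

/// Hypothetical identifier shared by all three representations of a segment.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SegmentId(pub u32);

/// Buffer Segment: the in-memory collection of writes.
pub struct BufferSegment {
    pub id: SegmentId,
    pub writes: Vec<String>,
}

/// WAL Segment: a file on locally attached disk holding the durable
/// record of what is in the matching buffer segment.
pub struct WalSegment {
    pub id: SegmentId,
    pub path: PathBuf,
}

/// Segment File (here named after the suggested `segment_persist_info`):
/// a summary in object store of the parquet files persisted for a segment.
pub struct SegmentPersistInfo {
    pub id: SegmentId,
    pub parquet_files: Vec<String>,
}

/// Pick out WAL segment ids from a directory listing, assuming files
/// are named `<number>.wal`. Anything else is ignored.
pub fn wal_segment_ids<'a>(file_names: impl IntoIterator<Item = &'a str>) -> Vec<SegmentId> {
    let mut ids: Vec<SegmentId> = file_names
        .into_iter()
        .filter_map(|name| {
            let path = Path::new(name);
            if path.extension()?.to_str()? != SEGMENT_FILE_EXTENSION {
                return None;
            }
            let stem = path.file_stem()?.to_str()?;
            stem.parse::<u32>().ok().map(SegmentId)
        })
        .collect();
    ids.sort();
    ids
}
```

With distinct types, a function that persists data can take a `BufferSegment` and return a `SegmentPersistInfo`, which makes it harder to confuse the local WAL file with the object store summary.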
Okay, that makes more sense to me. I think when I work on the persister I'll give Segment Files a name like that to be clearer.
Turn wal and write buffer references into a concrete type, rather than dyn.
LGTM thanks for making those changes @pauldix!
Closes #24557