feat: chunked log segment manager #218
Closed
Conversation
Switch to published version of Kafka deps
This commit introduces chunk transformations, which are the foundation of the chunking itself (both for upload and download) as well as of the optional encryption and compression.
Add chunk transformations
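As a rough illustration of the idea (not this PR's actual API; class and method names below are hypothetical), a per-chunk transform could be applied while a segment is split for upload and reversed by an inverse transform on download:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch only: illustrates per-chunk transforms applied on upload
// (e.g. compression, encryption) and reversed on download.
// Names here are assumptions, not the PR's actual classes.
public final class ChunkTransformSketch {
    // Split a segment into fixed-size chunks and apply a transform to each.
    static List<byte[]> transformChunks(final byte[] segment, final int chunkSize,
                                        final UnaryOperator<byte[]> transform) {
        final List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < segment.length; offset += chunkSize) {
            final int end = Math.min(offset + chunkSize, segment.length);
            chunks.add(transform.apply(Arrays.copyOfRange(segment, offset, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        final byte[] segment = new byte[10];
        // The identity transform stands in for optional compression/encryption.
        final List<byte[]> chunks = transformChunks(segment, 4, UnaryOperator.identity());
        System.out.println(chunks.size() + " chunks"); // 3 chunks: 4 + 4 + 2 bytes
    }
}
```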
feat: add custom S3 endpoint URL config
This commit adds JSON (Jackson-based) (de-)serialization to the chunk index classes and everything needed for it. Most notably, it adds a compact binary codec for the chunk lists present in variable-size indices.
Add chunk index serialization
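A minimal sketch of what Jackson-based JSON round-tripping of such an index could look like, assuming hypothetical field names; the compact binary codec for chunk lists is not shown:

```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical sketch: JSON round-trip of a fixed-size chunk index.
// Field names are assumptions; the real classes also use a compact
// binary codec for the chunk lists of variable-size indices.
public final class ChunkIndexJsonSketch {
    public static final class FixedSizeIndex {
        public final int originalChunkSize;
        public final int originalFileSize;

        @JsonCreator
        public FixedSizeIndex(
                @JsonProperty(value = "originalChunkSize", required = true) final int originalChunkSize,
                @JsonProperty(value = "originalFileSize", required = true) final int originalFileSize) {
            this.originalChunkSize = originalChunkSize;
            this.originalFileSize = originalFileSize;
        }
    }

    public static void main(String[] args) throws Exception {
        final ObjectMapper mapper = new ObjectMapper();
        final String json = mapper.writeValueAsString(new FixedSizeIndex(1024, 4096));
        final FixedSizeIndex back = mapper.readValue(json, FixedSizeIndex.class);
        System.out.println(json + " -> chunkSize=" + back.originalChunkSize);
    }
}
```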
They are added mostly for tests. In any case, these objects are not supposed to be compared on a hot path, so a simple implementation that produces more garbage was chosen.
These interfaces are meant to be implemented mainly by S3, GCS, and other object storage backends, which are yet to be done. The file system implementation is mostly for testing. To put this in future context, the plugin will instantiate concrete implementations of `ObjectStorageFactory`, which will cover S3, GCS, and others.
Add object storage interfaces and file system implementation
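A minimal sketch, with assumed interface names, of an object storage abstraction plus a file-system-backed implementation behind a factory; the real interfaces in this PR may differ, and the S3/GCS backends mentioned above would plug in the same way:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: an upload/fetch abstraction created by a factory,
// with a file-system-backed implementation used mostly for tests.
// Names are assumptions, not the PR's actual interfaces.
public final class ObjectStorageSketch {
    interface ObjectStorage {
        void upload(String key, InputStream data) throws IOException;
        InputStream fetch(String key) throws IOException;
    }

    interface ObjectStorageFactory {
        ObjectStorage create();
    }

    static final class FileSystemStorage implements ObjectStorage {
        private final Path root;

        FileSystemStorage(final Path root) {
            this.root = root;
        }

        @Override
        public void upload(final String key, final InputStream data) throws IOException {
            final Path target = root.resolve(key);
            Files.createDirectories(target.getParent());
            Files.copy(data, target);
        }

        @Override
        public InputStream fetch(final String key) throws IOException {
            return Files.newInputStream(root.resolve(key));
        }
    }
}
```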
This also includes secret encryption/decryption on serialization/deserialization.
Add segment manifest and its (de-)serialization
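For illustration only, one way a secret could be encrypted before being placed into the serialized manifest and decrypted on read; AES-GCM and the key handling below are assumptions, not necessarily the scheme used in this PR:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical sketch: encrypting a per-segment data key before it is written
// into the serialized manifest, and decrypting it on deserialization.
// Key management details here are assumptions, not the PR's actual scheme.
public final class ManifestSecretSketch {
    public static void main(String[] args) throws Exception {
        final SecretKey kek = KeyGenerator.getInstance("AES").generateKey();     // key-encryption key
        final SecretKey dataKey = KeyGenerator.getInstance("AES").generateKey(); // per-segment secret

        final byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        // Encrypt the data key with AES-GCM before placing it in the manifest.
        final Cipher encrypt = Cipher.getInstance("AES/GCM/NoPadding");
        encrypt.init(Cipher.ENCRYPT_MODE, kek, new GCMParameterSpec(128, iv));
        final String manifestField =
            Base64.getEncoder().encodeToString(encrypt.doFinal(dataKey.getEncoded()));

        // On deserialization, the same KEK and IV recover the data key.
        final Cipher decrypt = Cipher.getInstance("AES/GCM/NoPadding");
        decrypt.init(Cipher.DECRYPT_MODE, kek, new GCMParameterSpec(128, iv));
        final byte[] unwrapped = decrypt.doFinal(Base64.getDecoder().decode(manifestField));
        System.out.println("recovered key bytes: " + unwrapped.length);
    }
}
```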
changes: rename tieredstorage, move chunk to root, rename index module, move metadata to security
refactor: reorg commons module
Streamline encryption key and AAD generation
Add config for UniversalRemoteStorageManager
To clarify how the config will be passed from the framework to the plugin
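A hedged sketch of how the config map handed over by the broker could be parsed into a typed config object inside the plugin; the keys and defaults below are assumptions, not the actual `UniversalRemoteStorageManager` settings:

```java
import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;

import java.util.Map;

// Hypothetical sketch of a plugin-side config class. The framework hands the
// plugin a Map of properties; a ConfigDef like this one turns it into typed,
// validated values. Keys and defaults here are assumptions.
public class UniversalRemoteStorageManagerConfigSketch extends AbstractConfig {
    private static final ConfigDef CONFIG = new ConfigDef()
        .define("chunk.size", ConfigDef.Type.INT, 4 * 1024 * 1024,
                ConfigDef.Importance.HIGH, "Chunk size in bytes")
        .define("object.storage.root", ConfigDef.Type.STRING, null,
                ConfigDef.Importance.MEDIUM, "Root path for the file system storage backend");

    public UniversalRemoteStorageManagerConfigSketch(final Map<String, ?> props) {
        super(CONFIG, props);
    }

    public int chunkSize() {
        return getInt("chunk.size");
    }
}
```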
feat: segment compression checker
fix: remove parent directories to mimic s3 behaviour
Add and fix some toString implementations
…ed-fields Require fields in JSON deserialization of chunk indices
fix: chunk manager: return input stream instead of future
This doesn't make sense and also causes division by 0 in `FixedSizeChunkIndex.chunkCount`.
…lChunkSize-positive Don't allow originalChunkSize to be 0
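A hedged sketch of the kind of guard this implies, together with the chunk-count arithmetic that a zero chunk size would break; the actual checks in `FixedSizeChunkIndex` may differ:

```java
// Hypothetical sketch: rejecting a non-positive chunk size up front,
// since chunkCount divides by it. Validation details are assumptions.
public final class FixedSizeChunkIndexSketch {
    private final int originalChunkSize;
    private final int originalFileSize;

    FixedSizeChunkIndexSketch(final int originalChunkSize, final int originalFileSize) {
        if (originalChunkSize <= 0) {
            throw new IllegalArgumentException(
                "originalChunkSize must be positive, got " + originalChunkSize);
        }
        if (originalFileSize < 0) {
            throw new IllegalArgumentException(
                "originalFileSize must not be negative, got " + originalFileSize);
        }
        this.originalChunkSize = originalChunkSize;
        this.originalFileSize = originalFileSize;
    }

    // Ceiling division: a zero chunk size would make this throw ArithmeticException.
    int chunkCount() {
        return (originalFileSize + originalChunkSize - 1) / originalChunkSize;
    }
}
```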
feat: add fetch chunk transform
feat: add ChunkManager implementation
Simplify chunk fetch to avoid an unnecessary off-by-one error.
feat: add object range for fetcher
refactor: remove overwrite flag for FileSystem storage
UniversalRemoteStorageManager implementation
Will reopen if #217 is merged.
Following up #217.

Expands `ChunkManager` into `ChunkedLogSegmentManager` and absorbs `transform/FetchChunkTransform` to consolidate interactions between `TransformPipeline` (see #217) and the storage backend to implement URSM requests. Dependencies would flow: