resize FSST compressor buffer to be large enough for largest string #6676
Merged
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
e03bc2e to 8304ea8
robert3005 approved these changes on Feb 25, 2026
Summary
Closes: #6517
Previously, we used a pre-allocated 16MB buffer when compressing strings with FSST, assuming that the largest compressed string would always fit within that limit.
In practice this does not always hold, as the linked issue shows.
To avoid panics, we now resize the compressed buffer whenever we encounter a string whose compressed form may exceed its capacity. Since the buffer is now adaptive and most strings are small, we also shrink the default buffer size from 16MB to 1MB.
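For readers unfamiliar with the pattern, here is a minimal sketch of the grow-on-demand check described above. It is not the code in this PR: `worst_case_compressed_len` and `ensure_capacity` are hypothetical names, and the 2x bound assumes the FSST worst case where every input byte is emitted as an escape code followed by the literal byte. The actual compressor may require additional slack.

```rust
/// Default capacity after this change (1MB), per the PR description.
const DEFAULT_BUFFER_SIZE: usize = 1024 * 1024;

/// Hypothetical helper: upper bound on the compressed size of a string.
/// Assumes the worst case is one escape byte plus one literal byte per
/// input byte, i.e. a 2x expansion.
fn worst_case_compressed_len(plaintext_len: usize) -> usize {
    plaintext_len.saturating_mul(2)
}

/// Hypothetical helper: make sure the scratch buffer can hold the
/// worst-case output for a string of `plaintext_len` bytes.
fn ensure_capacity(buffer: &mut Vec<u8>, plaintext_len: usize) {
    let needed = worst_case_compressed_len(plaintext_len);
    if buffer.capacity() < needed {
        // The buffer is scratch space, so its old contents can be dropped.
        // After `clear`, `reserve(needed)` guarantees capacity >= needed.
        buffer.clear();
        buffer.reserve(needed);
    }
}
```

A caller would allocate the buffer once with `Vec::with_capacity(DEFAULT_BUFFER_SIZE)` and call `ensure_capacity` before compressing each string, so the buffer only grows when an unusually large string appears.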
Testing
Added a unit test that demonstrates FSST compressing a string large enough to require resizing the buffer.