-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add ZstdInputStream #1
base: preview-master
Are you sure you want to change the base?
Conversation
@jonmv can we go through this together at some point? i have some thoughts but very uncertain about best direction. |
Of course! Anything I should look at to prepare? I'm not familiar with zstd compression, so perhaps a whitepaper of sorts? |
look at the simple (non-streaming) decompressor in this library first: for the format itself, this is quite readable: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md |
inputPosition = 0; | ||
} | ||
else { | ||
int newSize = (inputBuffer.length * 2 + size) & BUFFER_SIZE_MASK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be 0
if the new size is otherwise too small, which can happen if the constructor specifying an input buffer size is used, with a small argument. I also didn't check for the possibility of overflow.
int magic = bb.getInt(); | ||
if ((magic >= 0x184D2A50) && (magic <= 0x184D2A5F)) { | ||
inputPosition += SIZE_OF_INT; | ||
skipBytes = bb.getInt(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be interpreted as an unsigned int.
} | ||
else { | ||
skipBytes -= inputAvailable(); | ||
inputPosition = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing update to evictedInput
? Perhaps a method for decreasing input position (and incrementing evicted counter) is a good idea?
outputPosition = 0; | ||
outputEnd = 0; | ||
if (singleSegmentFlag) { | ||
windowSize = (int) curHeader.contentSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contentSize
is an unsigned long, but we can't reliably use that as window size.
ensureOutputSpace(windowSize); | ||
} | ||
else { | ||
windowSize = curHeader.windowSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, windowSize
could also overflow an (unsigned) int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The library already enforces windowSize < 1 << 23
, so I guess this could be used here as well.
{ | ||
if (outputSpace() < size) { | ||
check(outputAvailable() == 0, "logic error"); | ||
if (windowSize * 4 + size < outputPosition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overflow?
int newSize = (outputBuffer.length | ||
+ windowSize * 4 | ||
+ size | ||
+ DEFAULT_BUFFER_SIZE) & BUFFER_SIZE_MASK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overflow?
blockDecompressor = null; | ||
if (contentChecksumFlag) { | ||
check(inputAvailable() >= SIZE_OF_INT, "missing checksum data"); | ||
inputPosition += SIZE_OF_INT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intentionally skipped for now?
* Pick out bits with binary literals, for readability * Simplify making room for output * Read block headers with binary literals * A couple of comments, and access static method through class
preview of PR