-
Notifications
You must be signed in to change notification settings - Fork 9.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GzipSource beginnings. #510
Conversation
OkHttp seems to be a place where binary formats can hang out in new clothes |
} | ||
|
||
/** Fills the buffer with at least {@code byteCount} bytes. */ | ||
private void require(int byteCount, Deadline deadline) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice. I think I've wanted this elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. One catch with making it general is the policy on over-reading. This class will read more data than it needs to; in some situations that could be bad. (Because there's no mechanism to push back bytes you didn't need.)
only editorial comments beyond the ones you've made. I'm heads down until my talk tomorrow, else I'd finish this for us! |
@swankjesse squash the commit I added if you like. should be good to go |
one sec. forgot the header crc test. |
ok all tests are backfilled. One thing interesting is that it seems a lot of gzip implementations in the wild use an old version of gzip which thinks the HCRC flag is continuation. Very few implement either. For example, if you compress with the HCRC and read it in vanilla osx gzip, it thinks the thing is multipart!! echo H4sIAgAAAAAAAB0m8yxRL1ZIVAj184xQKK4sLknNVVTwVMjOyy9XKMnILFYEAI2PrTcgAAAA|base64 --decode|gunzip -l -v -gzip: stdin is a a multi-part gzip file -- not supported |
/** | ||
* Allows us to customize a gzip impl someone else wrote so that we can test our implementation. | ||
*/ | ||
private static class GZIPOutputStream extends java.util.zip.GZIPOutputStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer to just create some golden base64/hex gzip files than generate them on the fly. With this we're really hacking GzipOutputStream in a way that isn't too natural.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(You should use this mechanism to create those files)
LGTM |
gunzip(gzipped); | ||
fail(); | ||
} catch (IOException e) { | ||
assertEquals("FHCRC: actual 0x00261d != expected 0x000000", e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I was expecting the string format to print 8 hex chars plus 0x
, but it included the 0x
in the 8 character width.
okie rewrote the test to use constants and squashed |
@@ -195,7 +241,7 @@ void write(byte[] data, int offset, int byteCount) { | |||
tail.limit += toCopy; | |||
} | |||
|
|||
this.byteCount += data.length; | |||
this.byteCount += byteCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a whoops :)
Needs tests in a follow up. In particular for the new
methods on OkBuffer (peekByte, skip, indexOf) and for
all of the weird header cases on gzip data (headers
that include names, comments, extra fields, header
CRCs), plus failing cases for header CRC, CRC and
ISIZE.