Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GzipSource beginnings. #510

Merged
merged 1 commit into from
Feb 8, 2014
Merged

GzipSource beginnings. #510

merged 1 commit into from
Feb 8, 2014

Conversation

swankjesse
Copy link
Collaborator

Needs tests in a follow up. In particular for the new
methods on OkBuffer (peekByte, skip, indexOf) and for
all of the weird header cases on gzip data (headers
that include names, comments, extra fields, header
CRCs), plus failing cases for header CRC, CRC and
ISIZE.

@codefromthecrypt
Copy link

OkHttp seems to be a place where binary formats can hang out in new clothes

}

/** Fills the buffer with at least {@code byteCount} bytes. */
private void require(int byteCount, Deadline deadline) throws IOException {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice. I think I've wanted this elsewhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. One catch with making it general is the policy on over-reading. This class will read more data than it needs to; in some situations that could be bad. (Because there's no mechanism to push back bytes you didn't need.)

@codefromthecrypt
Copy link

only editorial comments beyond the ones you've made. I'm heads down until my talk tomorrow, else I'd finish this for us!

@codefromthecrypt
Copy link

@swankjesse squash the commit I added if you like. should be good to go

@codefromthecrypt
Copy link

one sec. forgot the header crc test.

@codefromthecrypt
Copy link

ok all tests are backfilled. One thing interesting is that it seems a lot of gzip implementations in the wild use an old version of gzip which thinks the HCRC flag is continuation. Very few implement either. For example, if you compress with the HCRC and read it in vanilla osx gzip, it thinks the thing is multipart!!

echo H4sIAgAAAAAAAB0m8yxRL1ZIVAj184xQKK4sLknNVVTwVMjOyy9XKMnILFYEAI2PrTcgAAAA|base64 --decode|gunzip -l -v -gzip: stdin is a a multi-part gzip file -- not supported

/**
* Allows us to customize a gzip impl someone else wrote so that we can test our implementation.
*/
private static class GZIPOutputStream extends java.util.zip.GZIPOutputStream {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to just create some golden base64/hex gzip files than generate them on the fly. With this we're really hacking GzipOutputStream in a way that isn't too natural.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(You should use this mechanism to create those files)

@swankjesse
Copy link
Collaborator Author

LGTM

gunzip(gzipped);
fail();
} catch (IOException e) {
assertEquals("FHCRC: actual 0x00261d != expected 0x000000", e.getMessage());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I was expecting the string format to print 8 hex chars plus 0x, but it included the 0x in the 8 character width.

@codefromthecrypt
Copy link

okie rewrote the test to use constants and squashed

@@ -195,7 +241,7 @@ void write(byte[] data, int offset, int byteCount) {
tail.limit += toCopy;
}

this.byteCount += data.length;
this.byteCount += byteCount;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a whoops :)

swankjesse added a commit that referenced this pull request Feb 8, 2014
@swankjesse swankjesse merged commit 1b25214 into master Feb 8, 2014
@swankjesse swankjesse deleted the jwilson_0203_gzip_beginnings branch February 14, 2014 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants