ERROR ArcRecordUtils - Read 1224 bytes but expected 1300 bytes #128

Closed
ruebot opened this issue Nov 30, 2017 · 11 comments

@ruebot
Member

ruebot commented Nov 30, 2017

This is coming up a lot in the build.

Example:

2017-11-30 09:19:54,601 [Executor task launch worker for task 0] ERROR ArcRecordUtils - Read 1224 bytes but expected 1300 bytes. Continuing...
WARNING Record STARTING at 0 has 1761 trailing byte(s): file:/home/nruest/git/aut/target/test-classes/arc/badexample.arc.gz: {subject-uri=filedesc://IAH-20080430204825-00000-blackbook.arc, ip-address=0.0.0.0, origin=InternetArchive, length=1300, absolute-offset=0, creation-date=20080430204825, content-type=text/plain, version=1.1}
@ianmilligan1
Member

We had this back in Warcbase - see issue here lintool/warcbase#199.

According to this it should have been fixed in webarchive-commons (1.1.7). @jrwiebe did quite a bit of digging on this problem.

@ruebot
Member Author

ruebot commented Nov 30, 2017

@anjackson any chance this resurfaced in webarchive-commons 1.1.8?

@anjackson
Contributor

@ruebot Possibly, although the changeset doesn't seem to affect the original fix. Any chance you've got an older version on the classpath somewhere?
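One way to check the classpath question directly is to ask the JVM where a given class was loaded from. The following is a minimal sketch, not code from this project: the `ClassOrigin` helper and its `describe` method are hypothetical names, and the class to inspect is passed in as a fully qualified name rather than assumed.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of aut or webarchive-commons): reports which
// jar or directory a class was loaded from, to spot a stale copy on the classpath.
final class ClassOrigin {

    // Returns the code source location of the class, or a placeholder when the
    // JVM does not expose one (for example, for classes from the bootstrap loader).
    static String describe(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the fully qualified name of the class to inspect as the first argument;
        // with no argument, the helper inspects itself.
        String name = args.length > 0 ? args[0] : "ClassOrigin";
        System.out.println(name + " loaded from " + describe(Class.forName(name)));
    }
}
```

Run with the same classpath as the failing build, this prints the jar path for the named class, which complements the `mvn dependency:tree` output by showing what is actually loaded at runtime.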

@ruebot
Member Author

ruebot commented Nov 30, 2017

+- org.netpreserve.commons:webarchive-commons:jar:1.1.8:compile is the only webarchive-commons package that comes up with mvn dependency:tree.

@anjackson
Contributor

anjackson commented Nov 30, 2017

Hmmm, I'm confused. Your version of ArcRecordUtils doesn't have the additional code @jrwiebe added to cope with this situation.

Or am I misunderstanding the issue?

@ruebot
Member Author

ruebot commented Nov 30, 2017

Oh, that's weird.

@ruebot
Member Author

ruebot commented Nov 30, 2017

@lintool https://github.com/lintool/warcbase/blob/7bdc8b55fdbf96a6fd5a246de761fb344563ce1f/src/main/java/org/warcbase/data/ArcRecordUtils.java#L61-L66 vs https://github.com/lintool/warcbase/commits/master/warcbase-core/src/main/java/org/warcbase/data/ArcRecordUtils.java

Do you know what happened to the git history there? When you split it up into warcbase-core, do you remember if you used git mv or just mv? And did @jrwiebe's work get refactored into what we have now, or removed?

@ianmilligan1
Member

I think that predates the separation between warcbase-core and the other modules.

But aha, the branch @jrwiebe was working on was never merged. Here are the differences: https://github.com/lintool/warcbase/compare/arc-tobytes

@ianmilligan1
Member

Shoot, I wish we'd merged this.

After @jrwiebe did this branch, we had the split between warcbase-core and warcbase-hbase, and then subsequent I/O handling changes.

I don't think there's a straightforward way to port this stuff over (without knowing Java, which I don't).

Another option is to suppress, by default, the error raised when X bytes are expected but Y bytes are read.

@jrwiebe
Contributor

jrwiebe commented Nov 30, 2017

Since we know why that error is occurring, you could fairly safely suppress the error for now -- perhaps only for cases where the difference between bytes-read and bytes-expected is 76 (see lintool/warcbase#199 (comment)) -- until someone has time to merge my branch.
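The narrow suppression suggested here could look roughly like the sketch below. It is not the change that eventually resolved this issue: the class name `ByteCountCheck`, the constant, and both methods are hypothetical and only illustrate the "difference is exactly 76" rule. The sample values come from the log line at the top of the thread (1300 expected, 1224 read).

```java
// Hypothetical sketch (not the actual ArcRecordUtils code): decide whether a
// short read should still be reported as an error.
final class ByteCountCheck {

    // Assumption taken from the comment above: the known, benign shortfall is 76 bytes.
    static final long KNOWN_SHORTFALL = 76;

    // True when the record is short by exactly the known 76 bytes.
    static boolean isKnownShortfall(long bytesRead, long bytesExpected) {
        return bytesExpected - bytesRead == KNOWN_SHORTFALL;
    }

    // True when a mismatch should still be logged: any difference other than
    // the known 76-byte shortfall.
    static boolean shouldLogError(long bytesRead, long bytesExpected) {
        return bytesRead != bytesExpected && !isKnownShortfall(bytesRead, bytesExpected);
    }

    public static void main(String[] args) {
        // Values from the log line in this issue: 1300 - 1224 = 76, so no error is logged.
        System.out.println(shouldLogError(1224, 1300)); // prints "false"
        // Any other mismatch is still reported.
        System.out.println(shouldLogError(1200, 1300)); // prints "true"
    }
}
```

Keeping the check this specific means unrelated truncation problems would still surface in the logs.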

@ruebot
Member Author

ruebot commented Dec 4, 2017

Resolved with 3eb093a

ruebot closed this as completed Dec 4, 2017