-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add support for LZ4 #590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for LZ4 #590
Conversation
Add lz4 source tarball for built-in version. Tweak Event example to make it easier to compare compression algorithms. Tweak build files for Event example to include TreePlayer (for TTreePerfStats). Until we're sure we need it, no special path for Win32. Add -fPIC flag to compile lz4
|
Can one of the admins verify this patch? |
|
@phsft-bot build! |
|
Starting build on |
vgvassilev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
|
|
||
|
|
||
| #---Check for LZ4-------------------------------------------------------------------- | ||
| if(NOT builtin_lz4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we transform the two similar if-stmts into one if-else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not easily: the true-branch of the first if-statement can affect the outcome of the second conditional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. That part is tricky to read. Maybe we could add a clarification comment for the casual reader.
core/lz4/src/ZipLZ4.c
Outdated
| // Original Author: Brian Bockelman | ||
|
|
||
| /************************************************************************* | ||
| * Copyright (C) 1995-2015, Rene Brun and Fons Rademakers. * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we update the copyright year?
core/lz4/src/ZipLZ4.c
Outdated
| #include <stdint.h> | ||
|
|
||
| #if (__GNUC__ >= 3) || defined(__INTEL_COMPILER) | ||
| #if !defined(R__unlikely) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we rely on RConfig that would do this for us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup - cleaning that up as part of the response to clang-tidy's other suggestions.
test/MainEvent.cxx
Outdated
|
|
||
| TFile *hfile; | ||
| TTree *tree; | ||
| TTreePerfStats *ioperf = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use a nullptr.
|
Build failed on slc6/gcc62. Failing tests: |
|
Build failed on mac1012/native. Warnings:
Failing tests: |
core/zip/src/Bits.h
Outdated
| return; | ||
| R__zipLZMA(cxlevel, srcsize, src, tgtsize, tgt, irep); | ||
| return; | ||
| } else if (compressionAlgorithm == 4) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add and use a constant (like kLZMA) ...
test/MainEvent.cxx
Outdated
| nevent = TMath::Min(nevent,nentries); | ||
| if (read == 1) { //read sequential | ||
| //by setting the read cache to -1 we set it to the AutoFlush value when writing | ||
| ioperf = new TTreePerfStats("Perf Stats", tree); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the use of TTreePerfStats optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? I don't understand; seems to be harmless enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data collection takes 'some' time and thus if one wants to externally merge 'pure' performance one would want to run without.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also the comment on line 189 belongs next to line 191
test/Makefile
Outdated
| EVENTO = Event.$(ObjSuf) EventDict.$(ObjSuf) | ||
| EVENTS = Event.$(SrcSuf) EventDict.$(SrcSuf) | ||
| EVENTSO = libEvent.$(DllSuf) | ||
| EVENTLIBS = -lTreePlayer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An equivalent is needed in the windows section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not following: this is in a platform-neutral part of the Makefile, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but -l is not platform-neutral :) [So the -l indeed need to be avoided on windows.]
core/zip/src/ZInflate.c
Outdated
| fprintf(stderr, "Error R__unzip_header: error in header\n"); | ||
| return 1; | ||
| !(src[0] == 'C' && src[1] == 'S' && src[2] == Z_DEFLATED) && !(src[0] == 'X' && src[1] == 'Z' && src[2] == 0) && | ||
| !(src[0] == 'L' && src[1] == '4')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is becoming unwindly, can we create a named repository of those pairs to use here and the 2 or 3 other places.
|
@pcanal - any idea how to debug the When I run |
|
Build failed on slc6/gcc49. Failing tests: |
|
Build failed on centos7/gcc49. Failing tests: |
|
the root-io-geo-make error is due to a missing commit that is (new) on the master. for the other error try: |
|
The ctest failure should be fixed by root-project/roottest#45 |
|
Build failed on ubuntu14/native. Failing tests: |
|
@bbockelm please, try to be explicit in the git log so that later (much later) we can avoid looking at commit (to find a problem) ... "addressing review comment" does not give (our later selves) any clue of what is the nature of the change. [The only exception to that rule are 'white space' only change that have no functional side-effect]. Thanks. |
core/zip/src/ZInflate.c
Outdated
| return src[0] == 'C' && src[1] == 'S' && src[2] == Z_DEFLATED; | ||
| } | ||
|
|
||
| static int is_valid_header_xz(uch *src) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it 'xz' or LZMA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is technically xz, I believe. We use lzma elsewhere. Will update.
|
|
||
| /* New zlib format */ | ||
| if (src[0] == 'Z' && src[1] == 'L') { | ||
| if (is_valid_header_zlib(src)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is as the side effect of adding one comparison ... which actually lead me to ponder what the semantic of the (src[2] == Z_DEFLATED) test is ... and why/whether there isn't an equivalent for LZ4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's simply the header, as defined by core/zip/src/Bits.h. The header appears to be custom to ROOT. For LZMA and LZ4, we have just used a two-byte header.
| Int_t nentries = (Int_t)tree->GetEntries(); | ||
| nevent = TMath::Min(nevent,nentries); | ||
| if (read == 1) { //read sequential | ||
| ioperf = getenv("ENABLE_TTREEPERFSTATS") ? new TTreePerfStats("Perf Stats", tree) : nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we document this in the header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the last change need for merging. [Adding support for configure/make so that we can backport to older version can/should be another merge request]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's already taken care of, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I had missed it ... but now I also took another look at the commit messages ... for the sake of our future selves can you update the commit message that just say things like "Small fixups from review." into a short description of the functional changes :). Thanks.
| ifeq ($(PLATFORM),win32) | ||
| EVENTLIB = libEvent.lib | ||
| else | ||
| EVENTLIBS = -lTreePlayer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add
EVENTLIBS = $(ROOTSYS)/lib/libTreePlayer.lib
on the windows side.
| } | ||
|
|
||
| tgt[0] = 'L'; | ||
| tgt[1] = '4'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would have been 'even' better if those had been abstracted out in the same place as the reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be difficult - the generic header code and the compression-format specific code is interspersed with the actual implementation of the "old-style" compression algorithm.
I would prefer to defer this for follow-up work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. For the record, I would use a (potential) new header defining a set of strings/arrays ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, that's what I was thinking - new header to define the constants and rationalize the layout of the files (particularly with an eye towards separating the old C code).
| Int_t nentries = (Int_t)tree->GetEntries(); | ||
| nevent = TMath::Min(nevent,nentries); | ||
| if (read == 1) { //read sequential | ||
| ioperf = getenv("ENABLE_TTREEPERFSTATS") ? new TTreePerfStats("Perf Stats", tree) : nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the last change need for merging. [Adding support for configure/make so that we can backport to older version can/should be another merge request]
Update copyright date on files new to this branch. Add comment explaining logic in CMakeFile.txt Mostly whitespaces; of note, the re-definition of R__likely/unlikely has been moved back into RConfig.h.
Add LZ4 support to core library.
|
@pcanal - the GitHub PR page right now is a bit messy, but the rewording of the commit messages you requested is complete. Ready to go? |
|
Yes, thanks! |
|
Could you open a PR to update the configure/make :) ? Thanks. |
|
Hi, the PR broke Arm and fedora builds: could you please check? Cheers, |
|
@dpiparo - is it possible to pull up the logs? Unfortunately, the error message says simply the error message is in a different log file. |
|
Hi, I fixed that since the bug affected my development platform. Now the tarball is fetched from the eos based website from where all other tarballs are downloaded. The reason for the bug was the fact that curl could not use https. |
|
Thanks Danilo! Yes, that was a TODO item to stage the tarball to the official repo -- you just saved us some work. |
This PR adds support for the LZ4 compression algorithm (branch has been significantly cleaned up).
In testing with the sample
Eventobject found in thetestsubdirectory, we found that the resulting files were about 13% larger than the default compression settings. Deserialization speed was 95% of the uncompressed speed.As discussed with @pcanal