flate: Improve level 1-3 compression (#678)
Use 5 byte hash instead of 4 byte hash.

This improves compression in most cases and will also yield faster decompression.

Little to no performance impact.

Before/after:

```
file                   out   level  insize      outsize    millis  mb/s
nyc-taxi-data-10M.csv  gzkp  1      3325605752  922273214  14065   225.49
nyc-taxi-data-10M.csv  gzkp  1      3325605752  846471964  14342   221.12
nyc-taxi-data-10M.csv  gzkp  2      3325605752  883782053  15683   202.22
nyc-taxi-data-10M.csv  gzkp  2      3325605752  815766227  14865   213.35
nyc-taxi-data-10M.csv  gzkp  3      3325605752  878726683  17308   183.24
nyc-taxi-data-10M.csv  gzkp  3      3325605752  808448239  16882   187.86
nyc-taxi-data-10M.csv  gzkp  4      3325605752  789447233  20651   153.57
nyc-taxi-data-10M.csv  gzkp  4      3325605752  789447233  20657   153.53

file    out   level  insize      outsize    millis  mb/s
enwik9  gzkp  1      1000000000  382781160  5713    166.90
enwik9  gzkp  1      1000000000  374131553  5826    163.69
enwik9  gzkp  2      1000000000  371351753  6131    155.55
enwik9  gzkp  2      1000000000  361881529  5910    161.36
enwik9  gzkp  3      1000000000  364881746  6891    138.39
enwik9  gzkp  3      1000000000  355065173  6960    137.02
enwik9  gzkp  4      1000000000  342732211  8339    114.36
enwik9  gzkp  4      1000000000  342732211  8252    115.57

file         reset  out   level  files  insize     outsize   millis  mb/s
objectfiles  true   gzkp  1      708    300491980  56114777  1008    284.27
objectfiles  true   gzkp  1      708    300491980  55300071  998     286.90
objectfiles  true   gzkp  2      708    300491980  53946448  1147    249.71
objectfiles  true   gzkp  2      708    300491980  52750260  1109    258.36
objectfiles  true   gzkp  3      708    300491980  53110452  1220    234.82
objectfiles  true   gzkp  3      708    300491980  51947585  1211    236.46

One of the few regressions:

file                  out   level  insize      outsize     millis  mb/s
rawstudio-mint14.tar  gzkp  1      8558382592  3960117298  36682   222.50
rawstudio-mint14.tar  gzkp  1      8558382592  3985295228  36619   222.88
rawstudio-mint14.tar  gzkp  2      8558382592  3899597850  38683   210.99
rawstudio-mint14.tar  gzkp  2      8558382592  3921716642  36754   222.06
rawstudio-mint14.tar  gzkp  3      8558382592  3848762302  46588   175.19
rawstudio-mint14.tar  gzkp  3      8558382592  3846475496  45611   178.94
```
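To illustrate the difference between a 4 byte and a 5 byte hash, here is a minimal sketch. It is not the code from this commit: the table size, the multiplier constants, and the helper names (`hash4`, `hash5`, `load32`, `load64`) are assumptions chosen for the example. It shows two inputs that share their first 4 bytes but differ in the 5th, so a 4 byte hash maps them to the same table slot while a 5 byte hash can tell them apart.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Illustrative values only; they are assumptions, not taken from the commit's diff.
const (
	tableBits = 16           // assumed hash table size: 1<<16 entries
	prime4    = 2654435761   // odd 32-bit multiplier
	prime5    = 889523592379 // odd 40-bit multiplier
)

// hash4 mixes the 4 bytes in u and keeps the top tableBits bits.
func hash4(u uint32) uint32 {
	return (u * prime4) >> (32 - tableBits)
}

// hash5 mixes only the lowest 5 bytes of u: the left shift by 24 bits
// discards the upper 3 bytes before multiplying.
func hash5(u uint64) uint32 {
	return uint32(((u << (64 - 40)) * prime5) >> (64 - tableBits))
}

// load32 and load64 read little-endian words from the start of b.
func load32(b []byte) uint32 { return binary.LittleEndian.Uint32(b) }
func load64(b []byte) uint64 { return binary.LittleEndian.Uint64(b) }

func main() {
	a := []byte("abcdeXYZ")
	b := []byte("abcdfXYZ") // same first 4 bytes as a, different 5th byte

	fmt.Println("4-byte hashes equal:", hash4(load32(a)) == hash4(load32(b)))
	fmt.Println("5-byte hashes equal:", hash5(load64(a)) == hash5(load64(b)))
}
```

With a 4 byte hash both inputs land in the same slot, so a lookup can return a candidate that matches for only 4 bytes. With a 5 byte hash a candidate found in the table is more likely to share at least 5 bytes with the current position, which is one plausible reason the output sizes above shrink in most cases.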
Showing 9 changed files with 100 additions and 106 deletions.