deflate: Better Huffman encoding #374

Merged on May 19, 2021 (3 commits)

@klauspost (Owner) commented May 18, 2021

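Speed up and improve Huffman compression.

Lower levels are also slightly affected: in the benchmarks below, the Constant cases run 29% to 56% faster, the Speed and SL cases 4% to 16% faster, and the Default and Compress cases stay roughly unchanged (+0.83% to -7.37%).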

λ benchcmp before.txt after.txt                                              
benchmark                               old ns/op     new ns/op     delta    
BenchmarkEncodeDigitsConstant1e4-32     32925         20581         -37.49%  
BenchmarkEncodeDigitsConstant1e5-32     425414        190512        -55.22%  
BenchmarkEncodeDigitsConstant1e6-32     4261446       1854389       -56.48%  
BenchmarkEncodeDigitsSpeed1e4-32        66777         57418         -14.02%  
BenchmarkEncodeDigitsSpeed1e5-32        855737        780709        -8.77%   
BenchmarkEncodeDigitsSpeed1e6-32        8584307       7212569       -15.98%  
BenchmarkEncodeDigitsDefault1e4-32      124753        123488        -1.01%   
BenchmarkEncodeDigitsDefault1e5-32      1536784       1496156       -2.64%   
BenchmarkEncodeDigitsDefault1e6-32      15765790      14603287      -7.37%   
BenchmarkEncodeDigitsCompress1e4-32     185589        180536        -2.72%   
BenchmarkEncodeDigitsCompress1e5-32     3264706       3291944       +0.83%   
BenchmarkEncodeDigitsCompress1e6-32     35219900      35345634      +0.36%   
BenchmarkEncodeDigitsSL1e4-32           59526         53537         -10.06%  
BenchmarkEncodeDigitsSL1e5-32           916883        879390        -4.09%   
BenchmarkEncodeDigitsSL1e6-32           9180701       8746477       -4.73%   
BenchmarkEncodeTwainConstant1e4-32      41059         29142         -29.02%  
BenchmarkEncodeTwainConstant1e5-32      486514        247555        -49.12%  
BenchmarkEncodeTwainConstant1e6-32      3938046       2531794       -35.71%  
BenchmarkEncodeTwainSpeed1e4-32         87027         81743         -6.07%   
BenchmarkEncodeTwainSpeed1e5-32         851805        784411        -7.91%   
BenchmarkEncodeTwainSpeed1e6-32         7885728       7330058       -7.05%   
BenchmarkEncodeTwainDefault1e4-32       126807        124323        -1.96%   
BenchmarkEncodeTwainDefault1e5-32       1371597       1338383       -2.42%   
BenchmarkEncodeTwainDefault1e6-32       13067533      12956724      -0.85%   
BenchmarkEncodeTwainCompress1e4-32      237083        233995        -1.30%   
BenchmarkEncodeTwainCompress1e5-32      4430928       4414564       -0.37%   
BenchmarkEncodeTwainCompress1e6-32      48377762      48591873      +0.44%   
BenchmarkEncodeTwainSL1e4-32            80816         77373         -4.26%   
BenchmarkEncodeTwainSL1e5-32            889941        842541        -5.33%   
BenchmarkEncodeTwainSL1e6-32            8740752       8199248       -6.20%   

Compressed sizes (before -> after):

nyc-taxi-data-10M.csv: 1877917504 -> 1872115496 bytes
consensus.db.10gb: 7774876313 -> 7769853327 bytes
github-june-2days-2019.json: 4104164737 -> 4097019597 bytes

Speed up and improve huffman compression:

```
λ benchcmp before.txt after.txt
benchmark                               old ns/op     new ns/op     delta
BenchmarkEncodeDigitsConstant1e4-32     32925         20138         -38.84%
BenchmarkEncodeDigitsConstant1e5-32     425414        218386        -48.67%
BenchmarkEncodeDigitsConstant1e6-32     4261446       1866023       -56.21%
BenchmarkEncodeDigitsSpeed1e4-32        66777         60683         -9.13%
BenchmarkEncodeDigitsSpeed1e5-32        855737        807328        -5.66%
BenchmarkEncodeDigitsSpeed1e6-32        8584307       7505546       -12.57%
BenchmarkEncodeDigitsDefault1e4-32      124753        123101        -1.32%
BenchmarkEncodeDigitsDefault1e5-32      1536784       1507136       -1.93%
BenchmarkEncodeDigitsDefault1e6-32      15765790      14838850      -5.88%
BenchmarkEncodeDigitsCompress1e4-32     185589        186598        +0.54%
BenchmarkEncodeDigitsCompress1e5-32     3264706       3277041       +0.38%
BenchmarkEncodeDigitsCompress1e6-32     35219900      35308128      +0.25%
BenchmarkEncodeDigitsSL1e4-32           59526         54858         -7.84%
BenchmarkEncodeDigitsSL1e5-32           916883        896292        -2.25%
BenchmarkEncodeDigitsSL1e6-32           9180701       8873708       -3.34%
BenchmarkEncodeTwainConstant1e4-32      41059         29454         -28.26%
BenchmarkEncodeTwainConstant1e5-32      486514        248799        -48.86%
BenchmarkEncodeTwainConstant1e6-32      3938046       2547548       -35.31%
BenchmarkEncodeTwainSpeed1e4-32         87027         82783         -4.88%
BenchmarkEncodeTwainSpeed1e5-32         851805        803264        -5.70%
BenchmarkEncodeTwainSpeed1e6-32         7885728       7452326       -5.50%
BenchmarkEncodeTwainDefault1e4-32       126807        126695        -0.09%
BenchmarkEncodeTwainDefault1e5-32       1371597       1373745       +0.16%
BenchmarkEncodeTwainDefault1e6-32       13067533      13027351      -0.31%
BenchmarkEncodeTwainCompress1e4-32      237083        234776        -0.97%
BenchmarkEncodeTwainCompress1e5-32      4430928       4396044       -0.79%
BenchmarkEncodeTwainCompress1e6-32      48377762      48015133      -0.75%
BenchmarkEncodeTwainSL1e4-32            80816         81162         +0.43%
BenchmarkEncodeTwainSL1e5-32            889941        868247        -2.44%
BenchmarkEncodeTwainSL1e6-32            8740752       8356943       -4.39%
```
@klauspost merged commit 6274b7e into master on May 19, 2021
@klauspost deleted the flate-improve-huffman-encoding branch on May 19, 2021 at 07:19