Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s2: Add stream index #462

Merged
merged 14 commits into from
Jan 11, 2022
Prev Previous commit
Update test.
  • Loading branch information
klauspost committed Jan 11, 2022
commit ac533aeca0960e2ec36e31ec86b0db7e80e766e5
14 changes: 7 additions & 7 deletions s2/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -640,12 +640,12 @@ Compression and speed is typically a bit better `MaxEncodedLen` is also smaller
Comparison of [`webdevdata.org-2015-01-07-subset`](https://files.klauspost.com/compress/webdevdata.org-2015-01-07-4GB-subset.7z),
53927 files, total input size: 4,014,735,833 bytes. amd64, single goroutine used:

| Encoder | Size | MB/s | Reduction |
|-----------------------|------------|--------|------------
| snappy.Encode | 1128706759 | 725.59 | 71.89% |
| s2.EncodeSnappy | 1093823291 | 899.16 | 72.75% |
| s2.EncodeSnappyBetter | 1001158548 | 578.49 | 75.06% |
| s2.EncodeSnappyBest | 944507998 | 66.00 | 76.47% |
| Encoder | Size | MB/s | Reduction |
|-----------------------|------------|------------|------------
| snappy.Encode | 1128706759 | 725.59 | 71.89% |
| s2.EncodeSnappy | 1093823291 | **899.16** | 72.75% |
| s2.EncodeSnappyBetter | 1001158548 | 578.49 | 75.06% |
| s2.EncodeSnappyBest | 944507998 | 66.00 | **76.47%**|

## Streams

Expand All @@ -657,7 +657,7 @@ Comparison of different streams, AMD Ryzen 3950x, 16 cores. Size and throughput:
| File | snappy.NewWriter | S2 Snappy | S2 Snappy, Better | S2 Snappy, Best |
|-----------------------------|--------------------------|---------------------------|--------------------------|-------------------------|
| nyc-taxi-data-10M.csv | 1316042016 - 539.47MB/s | 1307003093 - 10132.73MB/s | 1174534014 - 5002.44MB/s | 1115904679 - 177.97MB/s |
| enwik10 (xml) | 5088294643 - 433.45MB/s | 5175840939 - 8454.52MB/s | 4560784526 - 4403.10MB/s | 4340299103 - 159.71MB/s |
| enwik10 (xml) | 5088294643 - 451.13MB/s | 5175840939 - 9440.69MB/s | 4560784526 - 4487.21MB/s | 4340299103 - 158.92MB/s |
| 10gb.tar (mixed) | 6056946612 - 729.73MB/s | 6208571995 - 9978.05MB/s | 5741646126 - 4919.98MB/s | 5548973895 - 180.44MB/s |
| github-june-2days-2019.json | 1525176492 - 933.00MB/s | 1476519054 - 13150.12MB/s | 1400547532 - 5803.40MB/s | 1321887137 - 204.29MB/s |
| consensus.db.10gb (db) | 5412897703 - 1102.14MB/s | 5354073487 - 13562.91MB/s | 5335069899 - 5294.73MB/s | 5201000954 - 175.72MB/s |
Expand Down
5 changes: 4 additions & 1 deletion s2/encode_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -251,7 +251,10 @@ func TestIndex(t *testing.T) {
todo := input
for len(todo) > 0 {
// Write random sized inputs..
x := todo[:rng.Intn((len(todo)&65535)+2)]
x := todo[:rng.Intn(1+len(todo)&65535)]
if len(x) == 0 {
x = todo[:1]
}
_, err := enc.Write(x)
fatalErr(t, err)
// Flush once in a while
Expand Down