README.md: 11 additions & 8 deletions
@@ -1,6 +1,6 @@
# sqlite_zstd_vfs
-This [SQLite VFS extension](https://www.sqlite.org/vfs.html) provides streaming storage compression with [Zstandard](https://facebook.github.io/zstd/), compressing [pages of the main database file](https://www.sqlite.org/fileformat.html) as they're written out, and later decompressing them as they're read in. It runs page compression on background threads and occasionally generates [dictionaries](https://github.com/facebook/zstd#the-case-for-small-data-compression) to improve subsequent compression.
+This [SQLite VFS extension](https://www.sqlite.org/vfs.html) provides streaming storage compression with [Zstandard](https://facebook.github.io/zstd/), compressing [pages of the main database file](https://www.sqlite.org/fileformat.html) as they're written out, and later decompressing them as they're read in. It runs page de/compression on background threads and occasionally generates [dictionaries](https://github.com/facebook/zstd#the-case-for-small-data-compression) to improve subsequent compression.
Compressed page storage operates similarly to the [design of ZIPVFS](https://sqlite.org/zipvfs/doc/trunk/www/howitworks.wiki), the SQLite developers' proprietary extension. Because we're not as familiar with the internal "pager" module, we use a full-fledged SQLite database as the bottom-most layer. Where SQLite would write database page #P at offset P × page_size in the disk file, instead we `INSERT INTO outer_page_table(rowid,data) VALUES(P,compressed_inner_page)`, and later `SELECT data FROM outer_page_table WHERE rowid=P`. *You mustn't be afraid to dream a little smaller, darling...*
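
As a rough illustration of the page-as-row idea described above, here is a minimal Python sketch. It is not the extension's code. It uses the standard library's `zlib` as a stand-in for Zstandard and `INSERT OR REPLACE` so that a page can be rewritten. The helper names `write_page` and `read_page` and the one-column table schema are assumptions made for the example.

```python
import sqlite3
import zlib

PAGE_SIZE = 4096  # assumed inner page size for this sketch

# The "outer" database: an ordinary SQLite database whose rows hold compressed pages.
outer = sqlite3.connect(":memory:")
outer.execute("CREATE TABLE outer_page_table(data BLOB)")  # schema is an assumption


def write_page(page_no, page):
    """Store inner page #page_no as a compressed blob keyed by rowid."""
    outer.execute(
        "INSERT OR REPLACE INTO outer_page_table(rowid, data) VALUES(?, ?)",
        (page_no, zlib.compress(page)),  # zlib stands in for Zstandard here
    )


def read_page(page_no):
    """Fetch and decompress inner page #page_no, or return None if absent."""
    row = outer.execute(
        "SELECT data FROM outer_page_table WHERE rowid=?", (page_no,)
    ).fetchone()
    return None if row is None else zlib.decompress(row[0])


write_page(1, b"\x00" * PAGE_SIZE)
restored = read_page(1)
```

The point of the design is that the outer database handles variable-length storage, free-space management, and transactions, so the compression layer only needs a key-value interface.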
@@ -95,21 +95,24 @@ Repeat the query to see the update.
## Performance
-Here are some operation timings using a [1,195MiB TPC-H database](https://github.com/lovasoa/TPCH-sqlite) on my laptop. This isn't a thorough benchmark, just a rough indication that many applications should find the decompression overhead acceptable for the storage saved. (Credit to Zstandard!)
+Here are some operation timings using a [1,195MiB TPC-H database](https://github.com/lovasoa/TPCH-sqlite) on a Google N2 VM. This isn't a thorough benchmark, just a rough indication that many applications should find the de/compression overhead well worth the storage saved.
-Query 1 is an aggregation satisfied by one table scan. Decompression slows it down by ~25% while the database file shrinks by 50-60%. (Each query starts with a hot filesystem cache and cold database page cache.)
+Query 1 is an aggregation satisfied by one table scan. Foreground Zstandard decompression slows it down by ~25% while the database file shrinks by 45-65%. (Each query starts with a hot filesystem cache and cold database page cache.)
Query 8 is an [historically influential](https://www.sqlite.org/queryplanner-ng.html) eight-way join. SQLite's default ~2MB page cache is too small for its access pattern, leading to a disastrous slowdown from repeated page decompression; but simply increasing the page cache to 100MiB largely solves this problem. Evidently, we should prefer a much larger page cache in view of the increased cost to miss.
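
For reference, a 100MiB page cache can be requested with `PRAGMA cache_size`, where a negative value is interpreted as a size in KiB. The Python sketch below runs against an in-memory database purely so that it is self-contained; with this extension, the same pragma would presumably be issued on the application's connection to the compressed database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the application's connection

cache_kib = 100 * 1024  # 100 MiB expressed in KiB
# A negative cache_size asks for that many KiB of cache rather than a page count.
conn.execute(f"PRAGMA cache_size=-{cache_kib}")

configured = conn.execute("PRAGMA cache_size").fetchone()[0]
```

The setting applies per connection and is not persisted in the database file, so it needs to be issued each time the database is opened.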
+Background threads can be used for both compression and "prefetching" during sequential scans. This greatly improves bulk load speed, and can sometimes fully hide decompression latency (given sequential access patterns, large pages, and spare CPU availability).
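
To make the prefetching idea concrete, here is a simplified Python sketch of read-ahead decompression during a sequential scan. It is an illustration only, not how the extension is implemented: `zlib` stands in for Zstandard, and the function name `scan_with_prefetch` and its `lookahead` and `workers` parameters are hypothetical.

```python
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def scan_with_prefetch(compressed_pages, lookahead=2, workers=2):
    """Yield decompressed pages in order while later pages decompress in the background."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = deque()
        for blob in compressed_pages:
            # Start decompressing this page on a background thread.
            pending.append(pool.submit(zlib.decompress, blob))
            # Once `lookahead` pages are in flight beyond the oldest one, hand it over.
            if len(pending) > lookahead:
                yield pending.popleft().result()
        # Drain whatever is still in flight at the end of the scan.
        while pending:
            yield pending.popleft().result()


pages = [bytes([i]) * 4096 for i in range(10)]
compressed = [zlib.compress(p) for p in pages]
scanned = list(scan_with_prefetch(compressed))
```

When the consumer is slower than decompression, each `result()` call returns immediately, which is the sense in which latency can be hidden. With random access, small pages, or no spare CPU, the speculative work is wasted.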
+
## Tuning
Some parameters are controlled from the file URI's query string opening the database, while others are set later through [PRAGMA statements](https://www.sqlite.org/pragma.html):
@@ -125,7 +128,7 @@ Some parameters are controlled from the file URI's query string opening the data
* **&outer_unsafe=false**: set true to speed up bulk load by disabling transaction safety for outer database (app crash easily causes corruption)
* **&outer_cache_size=-2000**: page cache size for outer database, in [PRAGMA cache_size](https://www.sqlite.org/pragma.html#pragma_cache_size) units. Limited effect if on SSD.
-* **&noprefetch=false**: set true to disable background prefetching/decompression even if threads>1. Prefetching is counterproductive if page size is too small or CPU cycles are scarce.
+* **&noprefetch=false**: set true to disable background prefetching/decompression even if threads>1. Prefetching is counterproductive if page_size is too small or CPU cycles are scarce.
* **PRAGMA page_size=4096**: uncompressed [page size](https://www.sqlite.org/pragma.html#pragma_page_size) for the newly-created inner database. Larger pages are more compressible, but increase [read/write amplification](http://smalldatum.blogspot.com/2015/11/read-write-space-amplification-pick-2_23.html). YMMV but 8 or 16 KiB have been working well for us.
* **PRAGMA auto_vacuum=NONE**: set to FULL or INCREMENTAL on a newly-created database if you expect its size to fluctuate over time, so that the file will [shrink to fit](https://www.sqlite.org/pragma.html#pragma_auto_vacuum). (The outer database auto-vacuums when it's closed.)
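
As a small illustration of the query-string parameters listed above, the following Python sketch assembles a file URI. The helper `zstd_vfs_uri` is hypothetical, and the VFS name `zstd` is an assumption not stated in this excerpt. Only the parameter names (`outer_cache_size`, `noprefetch`, and so on) come from the list itself.

```python
from urllib.parse import urlencode


def zstd_vfs_uri(path, vfs_name="zstd", **params):
    """Build a SQLite file URI carrying VFS tuning parameters in its query string.

    `vfs_name` is an assumed registration name; booleans become "true"/"false".
    The path is used as-is, so it should not contain characters needing escaping.
    """
    query = {"vfs": vfs_name}
    for key, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[key] = str(value)
    return "file:" + path + "?" + urlencode(query)


uri = zstd_vfs_uri("/tmp/example.db", outer_cache_size=-2000, noprefetch=True)
# With the extension loaded, the URI would then be opened with URI handling enabled,
# e.g. sqlite3.connect(uri, uri=True), followed by any PRAGMA statements.
```

PRAGMA settings such as `page_size` and `auto_vacuum` are issued after opening, and per the list above they take effect on a newly-created database.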