sstable: add value blocks for storing older versions of a key
When WriterOptions.ValueBlocksAreEnabled is set to true, older versions of a key are written to a sequence of value blocks, and the key contains a valueHandle, which is a tuple (valueLen, blockNum, offsetInBlock). The assumption here is that most reads only need the value of the latest version, and many reads that care about an older version only need the value length.

Value blocks are a simple sequence of (varint encoded length, value bytes) tuples such that, given the uncompressed value block, the valueHandle can be used to cheaply read the value. The value blocks index connects the blockNum in the valueHandle to the location of the value block. It uses fixed width encoding to avoid the expense of a general purpose key-value block.

See the comment at the top of value_block.go for details.

The following are preliminary results from a read benchmark, after some performance tuning. The old numbers are from master. The needValue=false cases are the ones where value blocks are expected to help.
- The versions=1 cases have no values in value blocks, and the slowdown is the extra call to valueBlockReader that needs to subslice to remove the single byte prefix.
- The hasCache=false cases correspond to a cold cache, where there will be additional wasted decompression of values that we don't need (when needValue=false). As expected, when there is an improvement, it is larger with hasCache=false. For example the -97.83% below (almost 50x faster) compared with -79.89%.
- The needValue=true cases are where the code can be up to 2x slower. The higher slowdowns occur when the value size is smaller. In such cases more inline values can be packed into an ssblock, and the code overhead of decoding the valueHandle and the value length in the value block (all of these are varints) becomes a significant component.

This is a prototype in that there are no changes to the InternalIterator interface, and the read path only works for singleLevelIterator.
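To make the layout described above concrete, here is a minimal Go sketch of a value block as a sequence of (varint length, value bytes) tuples, together with a handle that locates one tuple. It is an illustration under stated assumptions, not the code in value_block.go: the field widths, the function names appendValue, readValue, encodeHandle and decodeHandle, and the choice to encode the handle as three consecutive uvarints are all hypothetical. The value blocks index and compression are left out. It assumes Go 1.19 or later for binary.AppendUvarint.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// valueHandle mirrors the (valueLen, blockNum, offsetInBlock) tuple.
// The uint32 field widths are an assumption made for this sketch.
type valueHandle struct {
	valueLen      uint32
	blockNum      uint32
	offsetInBlock uint32
}

// appendValue appends one (varint length, value bytes) tuple to block and
// returns the grown block plus a handle pointing at the new tuple.
func appendValue(block []byte, blockNum uint32, value []byte) ([]byte, valueHandle) {
	h := valueHandle{
		valueLen:      uint32(len(value)),
		blockNum:      blockNum,
		offsetInBlock: uint32(len(block)),
	}
	block = binary.AppendUvarint(block, uint64(len(value)))
	block = append(block, value...)
	return block, h
}

// readValue resolves a handle against an uncompressed value block.
func readValue(block []byte, h valueHandle) ([]byte, error) {
	if uint64(h.offsetInBlock) >= uint64(len(block)) {
		return nil, errors.New("offset past end of value block")
	}
	n, w := binary.Uvarint(block[h.offsetInBlock:])
	if w <= 0 {
		return nil, errors.New("corrupt length varint")
	}
	start := int(h.offsetInBlock) + w
	if n != uint64(h.valueLen) || n > uint64(len(block)-start) {
		return nil, errors.New("length mismatch or truncated value")
	}
	return block[start : start+int(n)], nil
}

// encodeHandle appends the handle to dst as three uvarints (hypothetical format).
func encodeHandle(dst []byte, h valueHandle) []byte {
	dst = binary.AppendUvarint(dst, uint64(h.valueLen))
	dst = binary.AppendUvarint(dst, uint64(h.blockNum))
	return binary.AppendUvarint(dst, uint64(h.offsetInBlock))
}

// decodeHandle parses a handle and reports how many bytes it consumed.
func decodeHandle(src []byte) (valueHandle, int, error) {
	var fields [3]uint32
	off := 0
	for i := range fields {
		v, w := binary.Uvarint(src[off:])
		if w <= 0 || v > math.MaxUint32 {
			return valueHandle{}, 0, errors.New("corrupt value handle")
		}
		fields[i] = uint32(v)
		off += w
	}
	h := valueHandle{valueLen: fields[0], blockNum: fields[1], offsetInBlock: fields[2]}
	return h, off, nil
}

func main() {
	var block []byte
	var handles []valueHandle
	for _, v := range []string{"v3-old", "v2-older", "v1-oldest"} {
		var h valueHandle
		block, h = appendValue(block, 0, []byte(v))
		handles = append(handles, h)
	}
	// A reader that only needs the length never touches the value block.
	fmt.Println("length only:", handles[1].valueLen)
	// A reader that needs the bytes follows the handle into the block.
	v, err := readValue(block, handles[1])
	if err != nil {
		panic(err)
	}
	fmt.Printf("value: %s\n", v)
}
```

The sketch shows why the needValue=false path can skip the value block entirely: valueLen is available from the handle alone, while the needValue=true path pays for decoding the handle and then the length varint inside the block.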
name                                                                        old time/op  new time/op     delta
ValueBlocks/valueSize=100/versions=1/needValue=false/hasCache=false-16      25.5ns ± 3%  25.9ns ± 2%    +1.50%  (p=0.028 n=10+10)
ValueBlocks/valueSize=100/versions=1/needValue=false/hasCache=true-16       15.6ns ± 1%  15.5ns ± 2%         ~  (p=0.268 n=9+10)
ValueBlocks/valueSize=100/versions=1/needValue=true/hasCache=false-16       27.3ns ± 3%  29.5ns ± 3%    +8.11%  (p=0.000 n=10+10)
ValueBlocks/valueSize=100/versions=1/needValue=true/hasCache=true-16        17.1ns ± 2%  19.2ns ± 2%   +12.74%  (p=0.000 n=10+10)
ValueBlocks/valueSize=100/versions=10/needValue=false/hasCache=false-16     26.7ns ± 2%  29.4ns ± 2%   +10.46%  (p=0.000 n=9+10)
ValueBlocks/valueSize=100/versions=10/needValue=false/hasCache=true-16      15.9ns ± 2%  15.2ns ± 3%    -4.63%  (p=0.000 n=9+10)
ValueBlocks/valueSize=100/versions=10/needValue=true/hasCache=false-16      26.7ns ± 2%  53.0ns ± 4%   +98.79%  (p=0.000 n=9+10)
ValueBlocks/valueSize=100/versions=10/needValue=true/hasCache=true-16       16.6ns ± 1%  26.7ns ± 2%   +61.05%  (p=0.000 n=9+9)
ValueBlocks/valueSize=100/versions=100/needValue=false/hasCache=false-16    28.3ns ± 4%  25.3ns ± 5%   -10.74%  (p=0.000 n=10+10)
ValueBlocks/valueSize=100/versions=100/needValue=false/hasCache=true-16     15.8ns ± 2%  14.9ns ± 1%    -5.66%  (p=0.000 n=10+10)
ValueBlocks/valueSize=100/versions=100/needValue=true/hasCache=false-16     29.4ns ± 4%  47.8ns ± 3%   +62.46%  (p=0.000 n=10+10)
ValueBlocks/valueSize=100/versions=100/needValue=true/hasCache=true-16      16.7ns ± 4%  26.1ns ± 3%   +56.04%  (p=0.000 n=10+10)
ValueBlocks/valueSize=1000/versions=1/needValue=false/hasCache=false-16      123ns ± 4%   125ns ± 7%         ~  (p=0.735 n=9+10)
ValueBlocks/valueSize=1000/versions=1/needValue=false/hasCache=true-16      23.0ns ± 5%  22.9ns ± 5%         ~  (p=0.684 n=10+10)
ValueBlocks/valueSize=1000/versions=1/needValue=true/hasCache=false-16       124ns ± 6%   131ns ± 7%    +5.76%  (p=0.008 n=9+10)
ValueBlocks/valueSize=1000/versions=1/needValue=true/hasCache=true-16       24.3ns ± 4%  26.4ns ± 3%    +8.26%  (p=0.000 n=10+10)
ValueBlocks/valueSize=1000/versions=10/needValue=false/hasCache=false-16     130ns ± 8%    27ns ± 4%   -79.10%  (p=0.000 n=10+10)
ValueBlocks/valueSize=1000/versions=10/needValue=false/hasCache=true-16     23.8ns ± 4%  16.6ns ± 2%   -30.00%  (p=0.000 n=10+10)
ValueBlocks/valueSize=1000/versions=10/needValue=true/hasCache=false-16      128ns ± 9%   164ns ±12%   +27.94%  (p=0.000 n=10+10)
ValueBlocks/valueSize=1000/versions=10/needValue=true/hasCache=true-16      25.0ns ± 4%  33.0ns ± 2%   +32.22%  (p=0.000 n=10+10)
ValueBlocks/valueSize=1000/versions=100/needValue=false/hasCache=false-16    123ns ± 9%    28ns ± 3%   -76.89%  (p=0.000 n=9+10)
ValueBlocks/valueSize=1000/versions=100/needValue=false/hasCache=true-16    23.0ns ± 2%  15.3ns ± 5%   -33.36%  (p=0.000 n=10+9)
ValueBlocks/valueSize=1000/versions=100/needValue=true/hasCache=false-16     132ns ± 2%   171ns ± 5%   +29.24%  (p=0.000 n=8+10)
ValueBlocks/valueSize=1000/versions=100/needValue=true/hasCache=true-16     24.3ns ± 3%  32.6ns ± 3%   +33.98%  (p=0.000 n=10+10)
ValueBlocks/valueSize=10000/versions=1/needValue=false/hasCache=false-16    1.45µs ± 8%  1.35µs ±10%    -6.41%  (p=0.015 n=10+10)
ValueBlocks/valueSize=10000/versions=1/needValue=false/hasCache=true-16     75.5ns ± 2%  76.7ns ± 5%         ~  (p=0.218 n=10+10)
ValueBlocks/valueSize=10000/versions=1/needValue=true/hasCache=false-16     1.34µs ± 3%  1.46µs ±16%    +9.03%  (p=0.022 n=9+10)
ValueBlocks/valueSize=10000/versions=1/needValue=true/hasCache=true-16      77.0ns ± 3%  79.9ns ± 3%    +3.80%  (p=0.000 n=9+10)
ValueBlocks/valueSize=10000/versions=10/needValue=false/hasCache=false-16   1.46µs ± 6%  0.13µs ± 3%   -91.15%  (p=0.000 n=9+9)
ValueBlocks/valueSize=10000/versions=10/needValue=false/hasCache=true-16    76.4ns ± 3%  21.4ns ± 2%   -72.06%  (p=0.000 n=10+10)
ValueBlocks/valueSize=10000/versions=10/needValue=true/hasCache=false-16    1.47µs ± 8%  1.56µs ± 7%    +5.72%  (p=0.013 n=9+10)
ValueBlocks/valueSize=10000/versions=10/needValue=true/hasCache=true-16     78.1ns ± 4%  76.1ns ± 2%    -2.52%  (p=0.009 n=10+10)
ValueBlocks/valueSize=10000/versions=100/needValue=false/hasCache=false-16  1.34µs ± 5%  0.03µs ± 2%   -97.83%  (p=0.000 n=9+10)
ValueBlocks/valueSize=10000/versions=100/needValue=false/hasCache=true-16   77.0ns ± 2%  15.5ns ± 2%   -79.89%  (p=0.000 n=8+10)
ValueBlocks/valueSize=10000/versions=100/needValue=true/hasCache=false-16   1.42µs ± 9%  1.49µs ± 2%    +5.28%  (p=0.007 n=10+9)
ValueBlocks/valueSize=10000/versions=100/needValue=true/hasCache=true-16    78.5ns ± 4%  73.0ns ± 4%    -7.01%  (p=0.000 n=10+9)

Informs cockroachdb#1170