@@ -396,42 +396,23 @@ for a file size:
396396
397397Unfortunately, we're not quite done. The popcount function is non-injective,
398398so we can only find the file size from the block index, not the other way
399- around. However, we can guess and correct. Consider an n' block index that
400- is greater than n, we can find one pretty easily:
399+ around. However, we can solve for an n' block index that is greater than n
400+ with an error bounded by the range of the popcount function. We can then
401+ repeatedly substitute this n' into the original equation until the error
402+ is smaller than the integer division. As it turns out, we only need to
403+ perform this substitution once. Now we directly calculate our block index:
401404
402- ![summation3step1](https://latex.codecogs.com/svg.latex?n%27%20%3D%20%5Cleft%5Clfloor%5Cfrac%7BN%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%5Crfloor)
405+ ![formulaforn](https://latex.codecogs.com/svg.latex?n%20%3D%20%5Cleft%5Clfloor%5Cfrac%7BN-%5Cfrac%7Bw%7D%7B8%7D%5Cleft%28%5Ctext%7Bpopcount%7D%5Cleft%28%5Cfrac%7BN%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D-1%5Cright%29&plus;2%5Cright%29%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%5Crfloor)
403406
404- where:
405- n' >= n
406-
407- We can plug n' back into our popcount equation to find an N' file size that
408- is greater than N. However, we need to rearrange our terms a bit to avoid
409- integer overflow:
410-
411- ![summation3step2](https://latex.codecogs.com/svg.latex?N%27%20%3D%20%28B-2%5Cfrac%7Bw%7D%7B8%7D%29n%27&plus;%5Cfrac%7Bw%7D%7B8%7D%5Ctext%7Bpopcount%7D%28n%27%29)
412-
413- where:
414- N' >= N
415-
416- Now that we have N', we can find our block offset:
417-
418- ![summation3step3](https://latex.codecogs.com/svg.latex?%5Cmathit%7Boff%7D%27%20%3D%20N%20-%20N%27)
419-
420- where:
421- off' >= off, our byte offset in the block
422-
423- Now we're getting somewhere. N' is greater than or equal to N, and as long as
424- the number of pointers per block is bounded by the block size, it can only be
425- different by at most one block. So we have two cases that can be determined by
426- the sign of off'. If off' is negative, we correct n' and add a block to off'.
427- Note that we also need to incorporate the overhead of the last block to get
428- the right offset.
407+ Now that we have our block index n, we can just plug it back into the above
408+ equation to find the offset. However, we do need to rearrange the equation
409+ a bit to avoid integer overflow:
429410
430- ![summation3step4](https://latex.codecogs.com/svg.latex?n%2C%20%5Cmathit%7Boff%7D%20%3D%20%5Cbegin%7Bcases%7D%20n%27-1%2C%20%5Cmathit%7Boff%7D%27&plus;B%20%26%20%5Cmathit%7Boff%7D%27%20%3C%200%20%5C%5C%20n%27%2C%20%5Cmathit%7Boff%7D%27&plus;%5Cfrac%7Bw%7D%7B8%7D%5Cleft%5B%5Ctext%7Bctz%7D%28n%27%29&plus;1%5Cright%5D%20%26%20%5Cmathit%7Boff%7D%27%20%5Cgeq%200%20%5Cend%7Bcases%7D)
411+ ![formulaforoff](https://latex.codecogs.com/svg.latex?%5Cmathit%7Boff%7D%20%3D%20N%20-%20%5Cleft%28B-2%5Cfrac%7Bw%7D%7B8%7D%5Cright%29n%20-%20%5Cfrac%7Bw%7D%7B8%7D%5Ctext%7Bpopcount%7D%28n%29)
431412
432- It's a lot of math, but computers are very good at math. With these equations
433- we can solve for the block index + offset while only needed to store the file
434- size in O(1).
413+ The solution involves quite a bit of math, but computers are very good at math.
414+ We can now solve for the block index + offset while only needing to store the
415+ file size in O(1).
435416
436417Here is what it might look like to update a file stored with a CTZ skip-list:
437418```
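For readers checking the added `formulaforn` and `formulaforoff` equations, here is a minimal C sketch of how they might be evaluated. It is an illustration under stated assumptions, not code taken from this change: it assumes 32-bit pointers (w/8 = 4 bytes), a block size B comfortably larger than the pointer overhead (for example 512), and the names `ctz_index`, `popcount32`, and `PTR_BYTES` are hypothetical. The early return for a zero estimate is also an assumption, added so that `n - 1` cannot wrap around in unsigned arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define PTR_BYTES 4u /* w/8 for w = 32; an assumption, not stated in the diff */

/* Portable population count (number of set bits). */
static uint32_t popcount32(uint32_t x) {
    uint32_t count = 0;
    while (x != 0) {
        x &= x - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical helper: given block size B and file position N (both in
 * bytes), return the block index n and store the in-block offset in *off.
 * The offset counts from the start of the block, pointers included. */
static uint32_t ctz_index(uint32_t B, uint32_t N, uint32_t *off) {
    assert(B > 2 * PTR_BYTES);          /* keeps the divisor positive */
    uint32_t b = B - 2 * PTR_BYTES;     /* B - 2(w/8) */
    uint32_t n = N / b;                 /* first estimate, n' >= n */
    if (n == 0) {
        *off = N;                       /* block 0 stores no pointers */
        return 0;
    }
    /* one substitution of the estimate into the popcount term */
    n = (N - PTR_BYTES * (popcount32(n - 1) + 2)) / b;
    /* off = N - (B - 2(w/8))n - (w/8)popcount(n) */
    *off = N - b * n - PTR_BYTES * popcount32(n);
    return n;
}
```

With B = 512, position 4052 maps to block 8 at offset 16, which is just past that block's four pointers, while position 4051 is the last byte of block 7 (offset 511).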