Simplify VirtualHasher input iterator #14434
Labels
Platform Data Structures
Platform Virtual Map
Platform
Tickets pertaining to the platform
Tech Debt Reduced
Issues which reduce technical debt.
VirtualHasher
accepts a stream of dirty leaves (virtual leaf records) as its input. The stream must be sorted in virtual node path order, from the first leaf path to the last leaf path. Current hashing algorithm relies on this order.More often than not, the first leaf path and the last leaf path are on different ranks. In this case, last leaf parent is immediately to the left of the first leaf, and a few more paths close to the end of the range are to the left, too. It means, if a tree is traversed left to right, then at some point some leaves with higher paths may be processed before leaves with lower paths. There is a mechanism to work it around in
VirtualHasher
, but it makes hashing code slightly more complicated than it could be.This ticket is to make a change to order in which dirty leaves stream is sorted. For every leaf, its position in the stream should be according to the path at the first leaf rank - either the leaf itself, or its parent. That is, assume the leaf range is [N, 2N], while leaves N to M are on the first leaf rank, and leaves M+1 to 2N are on the last leaf rank. Then the stream should be [M+1, 2N] [N, M]
There are two scenarios where
VirtualHasher
is used, and both should be updated to use the order above. First, hashing a single copy from virtual pipeline. Dirty leaves stream is provided byVirtualNodeCache
. Second, hashing during reconnect, and in this case the stream is provided by a virtual view.The text was updated successfully, but these errors were encountered: