fix a mismanagement of the hash table that could lead to data loss
This commit fixes a data loss bug in our hopscotch table implementation.
Removing values from the table can result in other values becoming
disconnected and lost.

Let A, B, and C be values that all hash to cell 0.
Assume the hopscotch distance factor H = 2.

       0     1     2
    +-----+-----+-----+
    |     |     |     |
    +-----+-----+-----+

After adding A

       0     1     2
    +-----+-----+-----+
    |  A  |     |     |
    +-----+-----+-----+
       |
       +-Neighbors =

After adding B

       0     1     2
    +-----+-----+-----+
    |  A  |  B  |     |
    +-----+-----+-----+
       |
       +-Neighbors = 1

After adding C

       0     1     2
    +-----+-----+-----+
    |  A  |  B  |  C  |
    +-----+-----+-----+
       |
       +-Neighbors = 1, 2

If we then remove B,

       0     1     2
    +-----+-----+-----+
    |  A  | [X] |  C  |
    +-----+-----+-----+
       |
       +-Neighbors = 1, 2

* It is replaced with a placeholder [X].
* A's neighbor table is not updated to reflect the loss.

If we then remove A,

       0     1     2
    +-----+-----+-----+
    | [X] | [X] | [C] |
    +-----+-----+-----+
       |
       +-Neighbors = 2

* The table is rebalanced to promote A's lowest neighbor to the primary
  cell position.
* C from cell 2 remains cell 0's neighbor.

The bug manifests if [X] the placeholder value passes the null check set
out in MAP_TABLE_VALUE_NULL; that is, the placeholder is "effectively
null".

Looking up the key that matches C will first evaluate its base cell, the
one that collided with the key in the first place. Since that is
placeholder [X], and [X] is "effectively null", the lookup stops.

C is never retrieved from the hash table.

---

The expedient solution to this bug is to update cell 0's neighbors when B
is first removed, effectively skipping the hole:

If we remove B as above,

       0     1     2
    +-----+-----+-----+
    |  A  | [X] |  C  |
    +-----+-----+-----+
       |
       +-Neighbors = 2 <<< HERE

but clear the neighbor bit for cell 1, the promotion that happens when A
is later removed places C in cell 0.

       0     1     2
    +-----+-----+-----+
    |  C  | [X] | [X] |
    +-----+-----+-----+
       |
       +-Neighbors =
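
The walkthrough above can be replayed with a small self-contained model. This is only a sketch under stated assumptions, not the code touched by this commit: `table_t`, `cell_t`, `table_add`, `table_remove`, `table_contains`, `c_survives`, and `PLACEHOLDER` are hypothetical names, the placeholder is assumed to be "effectively null" (modelled as 0), the neighbor table is assumed to be a bitmask in the base cell, and displacement and resizing are left out. The `clear_bit` flag switches between the old behaviour and the fix described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { CELLS = 3, H = 2 };   /* three cells, distance factor H = 2 */
enum { PLACEHOLDER = 0 };    /* hypothetical "effectively null" value */

typedef struct {
    int      value;      /* PLACEHOLDER or a stored value */
    uint32_t neighbors;  /* bit d set: cell (home + d) holds a value homed here */
} cell_t;

typedef struct {
    cell_t cells[CELLS];
} table_t;

/* Store value in its home cell, or in the first free cell within H of it. */
static bool table_add(table_t *t, int home, int value)
{
    assert(value != PLACEHOLDER);
    cell_t *base = &t->cells[home];
    if (base->value == PLACEHOLDER && base->neighbors == 0) {
        base->value = value;
        return true;
    }
    for (int d = 1; d <= H && home + d < CELLS; d++) {
        if (t->cells[home + d].value == PLACEHOLDER) {
            t->cells[home + d].value = value;
            base->neighbors |= 1u << d;
            return true;
        }
    }
    return false;  /* a real table would displace or resize here */
}

/* Lookup stops immediately when the base cell is "effectively null". */
static bool table_contains(const table_t *t, int home, int value)
{
    const cell_t *base = &t->cells[home];
    if (base->value == PLACEHOLDER)
        return false;
    if (base->value == value)
        return true;
    for (int d = 1; d <= H && home + d < CELLS; d++) {
        if ((base->neighbors & (1u << d)) && t->cells[home + d].value == value)
            return true;
    }
    return false;
}

/* clear_bit == false models the bug; clear_bit == true models the fix. */
static void table_remove(table_t *t, int home, int value, bool clear_bit)
{
    cell_t *base = &t->cells[home];
    if (base->value == value) {
        /* Promote the lowest-numbered neighbor into the base cell. */
        for (int d = 1; d <= H && home + d < CELLS; d++) {
            if (base->neighbors & (1u << d)) {
                base->value = t->cells[home + d].value;
                t->cells[home + d].value = PLACEHOLDER;
                base->neighbors &= ~(1u << d);
                return;
            }
        }
        base->value = PLACEHOLDER;
        return;
    }
    for (int d = 1; d <= H && home + d < CELLS; d++) {
        if ((base->neighbors & (1u << d)) && t->cells[home + d].value == value) {
            t->cells[home + d].value = PLACEHOLDER;
            if (clear_bit)
                base->neighbors &= ~(1u << d);  /* skip the hole */
            return;
        }
    }
}

/* Replays the example: add A, B, C; remove B, then A; look up C. */
static bool c_survives(bool clear_bit)
{
    enum { A = 1, B = 2, C = 3 };
    table_t t = {0};
    table_add(&t, 0, A);
    table_add(&t, 0, B);
    table_add(&t, 0, C);
    table_remove(&t, 0, B, clear_bit);
    table_remove(&t, 0, A, clear_bit);
    return table_contains(&t, 0, C);
}
```

In this model, `c_survives(false)` returns false because the stale bit for cell 1 causes the placeholder to be promoted into cell 0, while `c_survives(true)` returns true because C is promoted instead.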