Add safe iterator tracking in hashtable to prevents invalid access on hashtable delete#2807
Conversation
…able deleted from under safe iterator Signed-off-by: Rain Valentine <rsg000@gmail.com>
d67a6b7 to
56db3fd
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #2807 +/- ##
============================================
+ Coverage 72.43% 72.45% +0.02%
============================================
Files 128 129 +1
Lines 70245 70522 +277
============================================
+ Hits 50880 51098 +218
- Misses 19365 19424 +59
🚀 New features to boost your workflow:
|
Signed-off-by: Rain Valentine <rsg000@gmail.com>
JimB123
left a comment
There was a problem hiding this comment.
Glad to see a proposal for this.
src/hashtable.c
Outdated
| iter *it = ht->safe_iterators; | ||
| while (it) { | ||
| it->hashtable = NULL; /* Mark as invalid */ | ||
| it = it->next_safe_iter; | ||
| } | ||
| ht->safe_iterators = NULL; |
There was a problem hiding this comment.
How about:
while (ht->safe_iterators) untrackSafeIterator(ht->safe_iterators);Ensures that the cleanup/invalidation is identical (like setting next_safe_iter to NULL, which was missed here).
There was a problem hiding this comment.
Oh that's much better. Very DRY
| iter->index = -1; | ||
| iter->flags = flags; | ||
| iter->next_safe_iter = NULL; | ||
| if (isSafe(iter) && ht != NULL) { |
There was a problem hiding this comment.
How can ht be NULL here? Maybe put an assertion at the top if you want to check this.
There was a problem hiding this comment.
kvstore's iterator contains a hashtableIterator, and in kvstoreIteratorInit() they want to init their hashtableIterator to something and redo it later when they decide which hashtable to actually start with. Or that was my impression. I discovered this from a failed tcl test 😅
There was a problem hiding this comment.
Hmmm, that sounds logically wrong to me. The hashtable is defining the hashtable-iterator API - not kvstore. It makes no logical sense for hashtable to support creation of an iterator with a null hashtable.
src/hashtable.c
Outdated
| @@ -2000,10 +2048,13 @@ void hashtableReinitIterator(hashtableIterator *iterator, hashtable *ht) { | |||
| /* Resets a stack-allocated iterator. */ | |||
| void hashtableResetIterator(hashtableIterator *iterator) { | |||
There was a problem hiding this comment.
I'm not clear on what this function is trying to do. I see that it can call resumeRehashing, but is that the extent of "reset"? I'm expecting that this would essentially rewind the iterator, allowing me to begin iteration again. However, it's not resetting index or table. Is this right?
Also, how is this different than hashtableReinitIterator (above)? It looks wrong to me that the reinit function above isn't checking if the iterator needs to be untracked - it just NULLs out next_safe_iterator. This looks like an issue.
There was a problem hiding this comment.
Good catch! I've fixed that, and I'm also renaming (which accounts for all the extra files changed):
- hashtableReinitIterator -> hashtableRetargetIterator
- hashtableResetIterator -> hashtableCleanupIterator
Signed-off-by: Rain Valentine <rsg000@gmail.com>
| ht->safe_iterators = it->next_safe_iter; | ||
| } else { | ||
| iter *current = ht->safe_iterators; | ||
| assert(current != NULL); |
There was a problem hiding this comment.
Not really needed. Has limited documentation value, and code will NPE in any case.
| assert(current != NULL); | ||
| while (current->next_safe_iter != it) { | ||
| current = current->next_safe_iter; | ||
| assert(current != NULL); |
There was a problem hiding this comment.
Not really needed. Has limited documentation value, and code will NPE in any case.
zuiderkwast
left a comment
There was a problem hiding this comment.
LGTM, except a trailing slash.
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
… hashtable delete (valkey-io#2807) This makes it safe to delete hashtable while a safe iterator is iterating it. This currently isn't possible, but this improvement is required for fork-less replication valkey-io#1754 which is being actively worked on. We discussed these issues in valkey-io#2611 which guards against a different but related issue: calling hashtableNext again after it has already returned false. I implemented a singly linked list that hashtable uses to track its current safe iterators. It is used to invalidate all associated safe iterators when the hashtable is released. A singly linked list is acceptable because the list length is always very small - typically zero and no more than a handful. Also, renames the internal functions: hashtableReinitIterator -> hashtableRetargetIterator hashtableResetIterator -> hashtableCleanupIterator --------- Signed-off-by: Rain Valentine <rsg000@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
… hashtable delete (valkey-io#2807) This makes it safe to delete hashtable while a safe iterator is iterating it. This currently isn't possible, but this improvement is required for fork-less replication valkey-io#1754 which is being actively worked on. We discussed these issues in valkey-io#2611 which guards against a different but related issue: calling hashtableNext again after it has already returned false. I implemented a singly linked list that hashtable uses to track its current safe iterators. It is used to invalidate all associated safe iterators when the hashtable is released. A singly linked list is acceptable because the list length is always very small - typically zero and no more than a handful. Also, renames the internal functions: hashtableReinitIterator -> hashtableRetargetIterator hashtableResetIterator -> hashtableCleanupIterator --------- Signed-off-by: Rain Valentine <rsg000@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
There was a problem hiding this comment.
This binary file was added to the repo in this PR. We need to remove it.
To avoid this in the future, NEVER use git add -a.
Introduced by mistake in #2807. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
in #2807 we added safe-iterator invalidation when a hashtable is deleted. When we create a safeIterator we initialize it to it->table=0 and it-index=-1. HOWEVER say someone creates a safe iterator never progress it and later on just delete the iterator using `hashtableReleaseIterator`. the iterator will NOT be untracked since, `hashtableCleanupIterator` will skip the untrack: ```c if (!(iter->index == -1 && iter->table == 0)) { if (isSafe(iter)) { hashtableResumeRehashing(iter->hashtable); untrackSafeIterator(iter); } else { assert(iter->fingerprint == hashtableFingerprint(iter->hashtable)); } } ``` and then when the hashtable is freed the iterator will be referenced leading to a use-after-free. For example: ```c it = hashtableCreateIterator(ht, HASHTABLE_ITER_SAFE); hashtableReleaseIterator(it); hashtableRelease(ht); ``` This PR fix it by ALWAYS untrack safe iterators which have a valid hashtable when they are being cleaned up. Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Introduced by mistake in valkey-io#2807. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
in valkey-io#2807 we added safe-iterator invalidation when a hashtable is deleted. When we create a safeIterator we initialize it to it->table=0 and it-index=-1. HOWEVER say someone creates a safe iterator never progress it and later on just delete the iterator using `hashtableReleaseIterator`. the iterator will NOT be untracked since, `hashtableCleanupIterator` will skip the untrack: ```c if (!(iter->index == -1 && iter->table == 0)) { if (isSafe(iter)) { hashtableResumeRehashing(iter->hashtable); untrackSafeIterator(iter); } else { assert(iter->fingerprint == hashtableFingerprint(iter->hashtable)); } } ``` and then when the hashtable is freed the iterator will be referenced leading to a use-after-free. For example: ```c it = hashtableCreateIterator(ht, HASHTABLE_ITER_SAFE); hashtableReleaseIterator(it); hashtableRelease(ht); ``` This PR fix it by ALWAYS untrack safe iterators which have a valid hashtable when they are being cleaned up. Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
This makes it safe to delete hashtable while a safe iterator is iterating it. This currently isn't possible, but this improvement is required for fork-less replication #1754 which @JimB123 is actively working on.
We discussed these issues in #2611 which guards against a different but related issue: calling hashtableNext again after it has already returned false.
I implemented a singly linked list that hashtable uses to track its current safe iterators. It is used to invalidate all associated safe iterators when the hashtable is released. A singly linked list is acceptable because the list length is always very small - typically zero and no more than a handful.
Also, renames the internal functions: