-
Notifications
You must be signed in to change notification settings - Fork 287
Prepare findEviction to be extended for multiple memory tiers #166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Remove the item from mmContainer and drop the lock before attempting eviction.
to avoid races with remove/replace with predicate. Also, refact tryEvictRegularItem and tryEvictChainedItem into a single function.
moving bit is used to give exclusive right to evict the item to a particular thread. Originially, there was an assumption that whoever marked the item as moving will try to free it until he succeeds. Since we don't want to do that in findEviction (potentially can take a long time) we need to make sure that unmarking is safe. This patch checks the flags after unmarking (atomically) and if ref is zero it also recyles the item. This is needed as there might be some concurrent thread releasing the item (and decrementing ref count). If moving bit is set, that thread would not free the memory back to allocator, resulting in memory leak on unmarkMoving().
under MMContainer combined_lock.
Checking token validity should not be needed right after creating it.
in findEviction, and rely only of recount being zero when releasing items back to allocator.
@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance looks good. No noticeable diff in A/B tests. I left comments on the correctness of a couple of places.
if (!itr) { | ||
itr.resetToBegin(); | ||
} | ||
itr.resetToBegin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep the original logic. With this change, we can still get here by skipping an item due to refcount > 0 when checking the hash table, and we should continue forward.
// If we destroyed the itr to possibly evict and failed, we restart
// from the beginning again
if (!itr) {
itr.resetToBegin();
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the version that's imported into Meta this is still not fixed and caused problems in production.
We would need to revert this and re-import it. We will try our best to prioritize the re-import. @igchor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Sorry about this, I think something went wrong when I did rebase to the latest main.
@@ -2468,7 +2340,7 @@ void CacheAllocator<CacheTrait>::releaseSlabImpl( | |||
// Need to mark an item for release before proceeding | |||
// If we can't mark as moving, it means the item is already freed | |||
const bool isAlreadyFreed = | |||
!markMovingForSlabRelease(releaseContext, alloc, throttler); | |||
!markExclusiveForSlabRelease(releaseContext, alloc, throttler); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer safe. The original intent is that only a freed item cannot be marked as moving and no allocations can allocate a freed item once that slab was set to be released.
With this change, we can get false here if the item was already marked as exclusive (e.g. from an eviction thread). This means here we would treat that item as one already freed, which isn't the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this cannot happen. If we fail to mark the item as exclusive inside processAllocForRelease
we will just spin on the while(true)
inside markExclusiveForSlabRelease
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ yup you're right.
// and the item is not expired. | ||
Item& item = *itr; | ||
const bool evictToNvmCache = shouldWriteToNvmCache(item); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need this check that returns early if token is invalid. Otherwise we trigger the assert at L2677.
That said, we could just skip evicting to NvmCache on eviction if we couldn't acquire a valid PutToken (which can happen due to contention on the in-flight-puts queues). It may cause impact on hit rate. So it's best done as a separate change. We could make this particular change internally (easier to test performance as well).
// record the in-flight eviciton. If not, we move on to next item to avoid
// stalling eviction.
if (evictToNvmCache && !token.isValid()) {
++itr;
stats_.evictFailConcurrentFill.inc();
return WriteHandle{};
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, fixed.
XDCHECK(evictHandle->getRefCount() == 1); | ||
|
||
if (evictToNvmCache && shouldWriteToNvmCacheExclusive(item)) { | ||
XDCHECK(token.isValid()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This failed our internal tests. You can repro this by running this cachebench test config cachelib/cachebench/test_configs/consistency/navy.json
. The issue is related to comments at L2644.
F1013 14:57:11.677257 633 CacheAllocator-inl.h:2677] Check failed: token.isValid()
*** Aborted at 1665698231 (Unix time, try 'date -d @1665698231') ***
*** Signal 6 (SIGABRT) (0x6300000161) received by PID 353 (pthread TID 0x7f2645c00640) (linux TID 633) (maybe from PID 353, UID 99) (code: -6), stack trace: ***
@ 000000000698b4be folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
./folly/experimental/symbolizer/SignalHandler.cpp:449
@ 0000000000000000 (unknown)
@ 000000000009c9d3 __GI___pthread_kill
@ 00000000000444ec __GI_raise
@ 000000000002c432 __GI_abort
@ 00000000069ac304 folly::LogCategory::admitMessage(folly::LogMessage const&) const
./folly/logging/LogCategory.cpp:71
@ 00000000069aa42d folly::LogStreamProcessor::logNow()
./folly/logging/LogStreamProcessor.cpp:190
@ 00000000069aa91c folly::LogStreamVoidify<true>::operator&(std::ostream&)
./folly/logging/LogStreamProcessor.cpp:222
@ 0000000005c608b6 facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::advanceIteratorAndTryEvictRegularItem(facebook::cachelib::MMLru::Container<facebook::cachelib::CacheItem<facebook::cachelib::LruCacheTrait>, &facebook::cachelib::CacheItem<facebook::cachelib::LruCacheTrait>::mmHook_>&, facebook::cachelib::MMLru::Container<facebook::cachelib::CacheItem<facebook::cachelib::LruCacheTrait>, &facebook::cachelib::CacheItem<facebook::cachelib::LruCacheTrait>::mmHook_>::Iterator&)
./cachelib/allocator/CacheAllocator-inl.h:2677
-> ./cachelib/allocator/CacheAllocator.cpp
@ 0000000005c5810d facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::findEviction(signed char, signed char)
./cachelib/allocator/CacheAllocator-inl.h:1251
-> ./cachelib/allocator/CacheAllocator.cpp
@ 0000000005c5897c facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::allocateChainedItemInternal(facebook::cachelib::detail::ReadHandleImpl<facebook::cachelib::CacheItem<facebook::cachelib::LruCacheTrait> > const&, unsigned int)
./cachelib/allocator/CacheAllocator-inl.h:409
-> ./cachelib/allocator/CacheAllocator.cpp
@ 0000000005c8bee0 facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::allocateChainedItem(facebook::cachelib::detail::ReadHandleImpl<facebook::cachelib::CacheItem<facebook::cachelib::LruCacheTrait> > const&, unsigned int)
./cachelib/allocator/CacheAllocator-inl.h:381
-> ./cachelib/allocator/CacheAllocator.cpp
@ 00000000059c0e0a facebook::cachelib::cachebench::CacheStressor<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::stressByDiscreteDistribution(facebook::cachelib::cachebench::ThroughputStats&)
./cachelib/cachebench/cache/Cache-inl.h:355
-> ./cachelib/cachebench/runner/Stressor.cpp
@ 00000000000df4e4 execute_native_thread_routine
@ 000000000009ac0e start_thread
@ 000000000012d1db __clone3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
auto& parent = candidate->getParentItem(compressor_); | ||
|
||
const bool evictToNvmCache = shouldWriteToNvmCache(parent); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here we should return early if token isn't valid. Otherwise we can trigger the assertion at L2734.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -2940,7 +2916,7 @@ bool CacheAllocator<CacheTrait>::markMovingForSlabRelease( | |||
// Since this callback is executed, the item is not yet freed | |||
itemFreed = false; | |||
Item* item = static_cast<Item*>(memory); | |||
if (item->markMoving()) { | |||
if (item->markExclusive()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now returns false if the item was already marked as exclusive. This causes issue in slab release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case it returns false, we will just spin in the while loop until the item is either freed or we succeed in marking it as exclusive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah you're right. This will just repeat in the loop. I noticed one more thing L2956 that I think we should remove the assert there.
cachelib/allocator/CacheItem.h
Outdated
* | ||
* Unmarking moving does not depend on `isInMMContainer` | ||
* Unmarking exclusive does not depend on `isInMMContainer` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add these to the comments
Unmarking exclusive will also return the refcount at the moment of unmarking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cachelib/allocator/Refcount.h
Outdated
* | ||
* Unmarking moving does not depend on `isInMMContainer` | ||
* Unmarking exclusive does not depend on `isInMMContainer` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Can you add these to the comments.
Unmarking exclusive will also return the refcount at the moment of unmarking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// among all the control bits is set. This indicates an item is already on | ||
// its way out of cache and does not need to be moved. | ||
bool isOnlyExclusive() const noexcept { | ||
// An item is only exclusive when its refcount is zero and only the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// An item is only exclusive when its refcount is zero and only the
// exclusive bit among all the control bits is set. This indicates an item
// is exclusive to the current thread. No other thread is allowed to
// do anything with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ce5a74a
to
549108b
Compare
@igchor has updated the pull request. You must reimport the pull request before landing. |
By the way, is there any fundamental reason for NvmCache::put to accept a handle instead of a raw pointer? I'm working on the next set of changes (based on this PR) and passing a raw ptr would be much simpler |
@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@@ -2962,7 +2953,7 @@ bool CacheAllocator<CacheTrait>::markMovingForSlabRelease( | |||
itemFreed = true; | |||
|
|||
if (shutDownInProgress_) { | |||
XDCHECK(!static_cast<Item*>(alloc)->isMoving()); | |||
XDCHECK(!static_cast<Item*>(alloc)->isExclusive()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this assert? I think this can be true now if an item is marked as exclusive by eviction thread while the slab release gets to this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. Removed.
@@ -2940,7 +2916,7 @@ bool CacheAllocator<CacheTrait>::markMovingForSlabRelease( | |||
// Since this callback is executed, the item is not yet freed | |||
itemFreed = false; | |||
Item* item = static_cast<Item*>(memory); | |||
if (item->markMoving()) { | |||
if (item->markExclusive()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah you're right. This will just repeat in the loop. I noticed one more thing L2956 that I think we should remove the assert there.
@@ -2468,7 +2340,7 @@ void CacheAllocator<CacheTrait>::releaseSlabImpl( | |||
// Need to mark an item for release before proceeding | |||
// If we can't mark as moving, it means the item is already freed | |||
const bool isAlreadyFreed = | |||
!markMovingForSlabRelease(releaseContext, alloc, throttler); | |||
!markExclusiveForSlabRelease(releaseContext, alloc, throttler); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ yup you're right.
By raw pointer do you mean There is no strong reason to use WriteHandle&. Mostly the code was just written this way. |
549108b
to
c5e6f2a
Compare
@igchor has updated the pull request. You must reimport the pull request before landing. |
Ok, I overstated it a bit :) It will make it slightly easier because in the patch, in some places we do not acquire a handle to the item anymore (.e.g. in findEviction we only operate on pointers/references). Once this PR is merged I'll rebase my second patch and push it so you can take a look. |
@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This pull request has been reverted by 928114c. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @igchor. This change caused an issue in production and we reverted it.
Once you get a chance to fix the comments, I'll import it again and run tests against the service it failed on.
|
||
// if token is invalid, return. iterator is already advanced. | ||
if (evictToNvmCache && !token.isValid()) { | ||
++itr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already advanced the iterator at line 2700 so we shouldn't do it again here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if (toReleaseHandle) { | ||
if (toReleaseHandle->hasChainedItem()) { | ||
const auto ref = candidate->unmarkExclusive(); | ||
if (ref == 0u) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still need to check if toReleaseHandle
is valid. Not because we want to make sure its not being evicted by another thread, but to know whether we did actually removed candidate
from the advanceIteratorAndTryEvict*Item
function call.
If an item was not removed, we should skip the logic inside the if-else block and simply unmarkExclusive and move on. If we don't want to return toReleaseHandle because we don't want to increment the refcount, we should find another way to check if the iterator was advanced or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it matter whether we did actually remove the candidate
? If it was some other thread that did this and the ref is 0
after unmarkExclsuive
we should still be able to recycle the item. unmarkExclusive
in this case is like decRef()
. For ref == 0
we are the last owner but instead of returning the item to allocator we reuse it.
I even belive that we have to recycle the item (or release it back to allocator) in this case. Otherwise we could have a memory leak.
Is my thinking correct or do I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right. unmarkExclusive
has both admin and access refcount right? if so, you are correct that we would have to recycle as this is the last owner.
I think there is just one small problem now:
C1: removed item and ref == 0 (no problem. The golden path)
C2: removed item and ref != 0 (shouldn't happen. No access ref can be created after removed from AccessContainer)
C3: not removed and ref == 0 (no problem. If happened, should still recycle because another thread had removed the item from Access and MMContainer, but didn't recycle because current thread is holding exclusive bit)
C4: not removed and ref > 0. In this case, we would increment the eviction failure counters again in the else block, while we already incremented them inside advanceIteratorAndTryEvict
.
I think we just need to fix C4 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, ref
will contain both access and admin bits.
You are also right about those 4 cases. Since C2 cannot happen, all the stats are always updated inside advanceIteratorAndTryEvict*
. So, I just removed the code which updates stats outside of those functions and added an assert
for C2.
if (!itr) { | ||
itr.resetToBegin(); | ||
} | ||
itr.resetToBegin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be fixed.
c5e6f2a
to
707cf1f
Compare
@igchor has updated the pull request. You must reimport the pull request before landing. |
attempts. Mark the item as exclusive before trying to remove from AC or MMContainer.
707cf1f
to
5b6d18c
Compare
@igchor has updated the pull request. You must reimport the pull request before landing. |
@haowu14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@igchor Update: we will be testing this in a prod-like setup in the week of Nov 28. |
Tested on Monday and landed just now. I'll move on to the next PR. |
This PR refactors the eviction logic (inside
findEviction
) so that it will be easy to add support for multiple memory tiers. Problem with multi-tier configuration is that the critical section under MMContainer lock is too long. To fix that we have implemented an algorithm which utilize WaitContext to decrease the critical section. (which will be part of future PRs).The idea is to use
moving
(nowexclusive
) bit to synchronize eviction with SlabRelease (and in future, with readers). In this PR, I only changed howfindEviction
synchronizes with SlabRelease.This PR is a subset of: #132
The above PR introduced some performance regressions in the single-memory-tier version which we weren't able to fix yet, hence we decided to first implement this part (which should not affect performance) and later we can add separate path for multi-tier or try to optimize the original patch.