-
Notifications
You must be signed in to change notification settings - Fork 4
[upstream] Initial framework for background item eviction/promotion #43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If and when this PR is merged into upstream we can followup with PRs that:
|
Is it better to set default values for interval and strategy ? I understand doing so will require a background thread attributes class since multiple default values won't be possible. |
I see now that default values are set in config. So maybe you can ignore my previous comment about background evictor/promoter thread creation api carrying default values for interval and strategy. |
Should this class be renamed to BackgroundMoverLocalStats ? I think adding the term "local" adds clarity to the per-thread stats collection mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the PR looks good. Please review the few cosmetic comments.
Reviewed 14 of 14 files at r1, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @igchor)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @guptask)
cachelib/allocator/CacheAllocator.h
line 1063 at r1 (raw file):
Previously, guptask (Sounak Gupta) wrote…
Is it better to set default values for interval and strategy ? I understand doing so will require a background thread attributes class since multiple default values won't be possible.
e.g. bool startNewBackgroundEvictor(BackgroundThreadOpts opts);
We don't have too many params, so I think it's fine as it is for now. startNewPoolOptimizer
and others also don't have default params.
cachelib/allocator/CacheAllocatorConfig.h
line 480 at r1 (raw file):
Previously, guptask (Sounak Gupta) wrote…
I see now that default values are set in config. So maybe you can ignore my previous comment about background evictor/promoter thread creation api carrying default values for interval and strategy.
Yeah, I think it's probably better to have the default values here and perhaps in enableBackgroundEvictor
function as well eventually.
cachelib/allocator/CacheStats.h
line 293 at r1 (raw file):
Previously, guptask (Sounak Gupta) wrote…
Should this class be renamed to BackgroundMoverLocalStats ? I think adding the term "local" adds clarity to the per-thread stats collection mode.
Well, we're using this class to aggregate stats from all Bg workers at the end so I'm not sure. This is done inside getBackgroundMoverStats
Summary: GitHub commits: facebook/wangle@43e5620 facebookexperimental/edencommon@3934bec facebookincubator/katran@a4c94aa Reviewed By: jurajh-fb fbshipit-source-id: 720310662068e4aa245198d817666700c9dde1d5
Summary: GitHub commits: facebook/fb303@c51a290 facebook/fbthrift@5e97023 facebook/proxygen@3710a32 facebook/wangle@fd29800 facebook/watchman@fca53d2 facebookexperimental/edencommon@3d8a6d9 facebookexperimental/rust-shed@75c5d75 facebookincubator/fizz@bae2d7e facebookincubator/katran@139c75d facebook/mvfst@001332d facebookincubator/velox@ec0ea1f Reviewed By: jurajh-fb fbshipit-source-id: a8a8ff6a7fddeaa7bbccb3faa6a737c50d22cfe2
Summary: As mentioned in the tasks, these stats make more sense in rates. Reviewed By: jiayuebao Differential Revision: D43243498 fbshipit-source-id: 556b0b4a062235bb3b34dd076e5f7a5a88316f8c
Summary: GitHub commits: facebook@ba9dd68 facebook/fbthrift@23208ff facebookincubator/velox@68b8a45 Reviewed By: jurajh-fb fbshipit-source-id: f8581e13cd1c0a2510d87f42c813ca2c65eaef1d
Summary: GitHub commits: facebook/fbthrift@3b00d58 Reviewed By: jurajh-fb fbshipit-source-id: 23ecb07e7e9ebd4dfbf3980de141ea1ebe36bb12
Summary: GitHub commits: facebook/fbthrift@df0eb61 facebook/proxygen@7fc3fea Reviewed By: jurajh-fb fbshipit-source-id: fa982baf35fd98edc52be391f03d15a9c2c333bf
Summary: GitHub commits: facebook/fbthrift@1d01d76 Reviewed By: jurajh-fb fbshipit-source-id: bdefba88e8d002c65d8b7c11e86c8aa772996d9b
Summary: GitHub commits: facebook/fbthrift@1b848d8 facebook/rocksdb@cfe50f7 pytorch/FBGEMM@c55d800 Reviewed By: bigfootjon fbshipit-source-id: e840d1116311ac52336e598442cf66cdc50d7725
Summary: GitHub commits: facebook/fbthrift@90fddab Reviewed By: bigfootjon fbshipit-source-id: cf7ac48c00c548805fb74663fe3e9bd9f5906cc2
Summary: Currently `replacedPtr` is passed as an argument and such usage isn't easy to understand (we kept receiving user feedback on that). Now we want to return it. Reviewed By: therealgymmy, antonf, jaesoo-fb Differential Revision: D43143989 fbshipit-source-id: eaae055107050cd67a7ca8b26b716fffc0c002f2
Summary: GitHub commits: facebook/folly@fdb333b facebook/watchman@0b238b0 pytorch/FBGEMM@d142cdf Reviewed By: bigfootjon fbshipit-source-id: ac185dc9dc24ba7d2765dca010958ce6b38bd029
Summary: GitHub commits: facebook/fbthrift@cbc3de5 facebook/rocksdb@229297d facebook/watchman@d873e11 facebookexperimental/rust-shed@2f45b88 Reviewed By: bigfootjon fbshipit-source-id: a65fa2098f6b6f1704154e0509e2f5423f9679eb
Summary: Change build script to pin fmt version to same as folly's to minimize future breaks. Pull Request resolved: facebook#196 Test Plan: Built successfully on a fresh clone of CacheLib. (Also had to change `external_git_branch=dev` for zstd to deal with the cmake/zstd issue in facebook#194, but that should resolve when gets merged into release) **Context:** OSS build broke between 3-5 Jan 2023, likely due to changes in folly. While switching to v9.1.0 or 9.0.0 fixes the issue at hand, it seems sensible to match folly, which specifies fmt v8.0.1: https://github.com/facebook/folly/blob/main/build/fbcode_builder/manifests/fmt > facebook#62 agordon: For the other packages, you'll notice we do use a specific git tag or branch… I notice `fmt` is an exception - not pinned to a specific git tag or revision - likely an omission that can be fixed. Related CacheLib issues: facebook#186, facebook#189, facebook#107, facebook#97, facebook#62 Possibly related CacheLib commit: 67cc11a Last working (Jan 3): https://github.com/facebook/CacheLib/actions/runs/3826992478 First failed (Jan 5): https://github.com/facebook/CacheLib/actions/runs/3844002307/jobs/6546742348 Error: `error: static assertion failed: Cannot format an argument. To make type T formattable provide a formatter<T> specialization: https://fmt.dev/latest/api.html#udt` Reviewed By: therealgymmy Differential Revision: D43517927 Pulled By: jiayuebao fbshipit-source-id: 2d28791f7804d862b646263b96b10b835f843d8c
Summary: Memcached's WSA logger will now emits GET_LEASE and SET_LEASE operations as well. This changes makes the cachebench treats those as GET and SET, respectively, for compatibility. Reviewed By: therealgymmy Differential Revision: D43336316 fbshipit-source-id: c9b842d567b9fb2128b822bd429f5dce30b378da
Summary: update to >6.0.1 version Reviewed By: antonk52 Differential Revision: D43575577 fbshipit-source-id: 7143da212ab6f124bffdfdaf3be29ff3ab986ffb
Summary: Fix OSS builds by adding numa deps to build files. Currently some fail on missing `numa.h`. Context: facebook#161 added the dependencies to the centOS, debian, and ubuntu18 build files. The PR was opened in Sep 2022 but only landed in Dec 2022, and so probably missed out on the fedora, rocky and arch build files which were added in-between those dates. Having had those build actions run on PRs would have caught this (currently, they are only scheduled.) Pull Request resolved: facebook#197 Test Plan: Github Actions builds (ideally, facebook#198 would be landed first.) I've checked that those packages exist for the respective repositories but didn't run them myself. Reviewed By: jaesoo-fb Differential Revision: D43587970 Pulled By: jiayuebao fbshipit-source-id: 8c59e48528042350e576a45ffc3bf2520699f5a9
Reviewed By: avalonalex Differential Revision: D43616310 fbshipit-source-id: 3367ad01ba31e5dc561d63a4f3b9746170e64912
Summary: Run GitHub action builds on every pull request, in addition to currently daily scheduled runs. Benefit: avoid accidentally breaking other OS builds. This would have caught facebook#197. This triggers whenever PRs are opened or when commits are added to a PR. Frequency of PRs (total of 76 PRs over last 18 months since CacheLib was open-sourced) is much less than frequency of daily scheduled builds, so this shouldn't add too many builds overall. Pull Request resolved: facebook#198 Reviewed By: therealgymmy Differential Revision: D43625609 Pulled By: jiayuebao fbshipit-source-id: 1572f6da32584ce6a1983d5e64afedf17ff17457
Summary: Add a custom deleter class that stores a `handle`. This allows object-cache to access the Item Handle via a shared_ptr. Both size-awareness feature and getting/updating object's TTL require that. Deleter class is marked as private because we don't want to expose `Handle` to object-cache users. Reviewed By: therealgymmy, jaesoo-fb Differential Revision: D42503594 fbshipit-source-id: 16ac14e6a84a1cfa80a3c145d440790002734a34
Summary: This change fixes following flaky tests in NvmCacheTests. * NvmCacheTest.Delete * NvmCacheTest.NvmEvicted * NvmCacheTest.EvictToNvmGetCheckCtime The root cause of the failures are essentially the same as D42443647 (facebook@5e7ff9a) which fixed the problem for NvmCacheTest.EvictToNvmGet; we are inserting enough items that could be spilled to NVM cache, where the NvmCache::put() can be dropped and the item is evicted completely when the delete operations (and tombstones) issued as part of the insertion are still outstanding. In order to fix the problem, this change flushes the NVM cache periodically during the insertions. Also, since this could cause more regions are used, the size of NVM cache needs to be increased. This change bumps the default size of NVM cache to 200MB (previous 100MB). Also, the size of persist storage used in the test PersistenceCache has been bumped by 100MB accordingly, i.e., from 400MB to 500MB. This change addresses the github issue facebook#169 Reviewed By: therealgymmy Differential Revision: D43592888 fbshipit-source-id: f0968884eb39fb5728b59129e98345df3240f01e
Summary: The Samsung PM9A3 does not report samsung in the model number so I added the specific model number to the vendorMap. Reviewed By: jaesoo-fb Differential Revision: D43676582 fbshipit-source-id: 6df19c40dd9da9563b75aa5847a1d1f9eb6a9aef
Summary: XDCHECK (or XCHECK) fails here for gcc-8.x in a lambda, so we move it outside. This occurs in CacheBench's AsyncCacheStressor.h. ``` /__w/CacheLib/CacheLib/cachelib/../cachelib/cachebench/runner/AsyncCacheStressor.h:306:7: internal compiler error: in cp_build_addr_expr_1, at cp/typeck.c:5965 ``` This line compiled fine with 8.5 before Jan 2023 so I suspect a regression in folly or other external library. It still compiles fine with gcc-7.5, 9.4, and 11.3.1. Possibly related commits: facebook/folly@1aafad4 and facebook/folly@e6d09f6 Line of gcc-8.5 that it fails on: https://github.com/gcc-mirror/gcc/blob/releases/gcc-8.5.0/gcc/cp/typeck.c#L5965 The version that isn't in a lambda compiles just fine: https://github.com/facebook/CacheLib/blob/df5b9f6ef35c55e432b6713c52397a03dd19c34c/cachelib/cachebench/runner/CacheStressor.h#L399 Pull Request resolved: facebook#201 Test Plan: GitHub actions. Built fine on my fork with CentOS 8.5/gcc-8.5. This issue currently causes 3 builds that use gcc-8.5 to fail (2 CentOS and RockyLinux-8.6) and 1 build using gcc-8.3 (Debian). Reviewed By: therealgymmy Differential Revision: D43681854 Pulled By: jaesoo-fb fbshipit-source-id: f3a65aefedcd98a26a80bb6ad009ad0d64e2395b
Summary: Add the following TTL-related APIs: - getConfiguredTtlSec - getExpiryTimeSec - extendTtlSec - updateExpiryTimeSec Usage: ``` auto ptr = objcache->find<T>("key"); auto configuredTtl = objcache->getConfiguredTtl(ptr); auto expiryTime = objcache->getExpiryTimeSec(ptr); objcache->extendTtl(ptr, std::chrono::seconds(3)); objcache->updateExpiryTimeSec(ptr, newExpiryTimeSecs); ``` Reviewed By: therealgymmy, jaesoo-fb Differential Revision: D43167879 fbshipit-source-id: 3b11fb0a2b9a3b5c38fcfd856ade100e6ae27470
Summary: Based on our discussion, this API would be confusing if a reference of ReadHandle is obtained from a WriteHandle. We also don't want to make it virtual because this will increase `sizeof(ReadHandle)` / `sizeof(WriteHandle)` by 8 bytes. Reviewed By: therealgymmy Differential Revision: D43667308 fbshipit-source-id: a77a17113f1a23f84332ebfa7ea6772d7647339c
Summary: This diff has been automatically generated by the inpage editor. NOTE: If you want to update this diff, go via the preview link inside the static docs section below. Ensure you are editing the same page that was used to create this diff. Reviewed By: therealgymmy Differential Revision: D43667238 fbshipit-source-id: 0be4c1ef376a5a1a2de92afc311af24f66d10afd
Summary: 1. Workaround for Debian Docker image bug that is breaking Debian build on GitHub (Explicitly mark Git repo as safe). 2. Pin zstd to a commit that resolves problems with older CMakes (note: affects all OSes, not just Debian) Context for 1: In latest Debian Docker image , there is a regression that affects the checkout action. From actions/checkout#1169: > - Checkout runs, and runs /usr/bin/git config --global --add safe.directory <path> > - The global .gitconfig does not exist > - Any calls to git remain unsafe/dubious The suggested workaround was to use --system instead of --global. Pull Request resolved: facebook#200 Test Plan: See if GitHub Action Debian build is fixed. Reviewed By: therealgymmy Differential Revision: D43720363 Pulled By: jaesoo-fb fbshipit-source-id: 54f3586cc7f8e72045e60d8dd454c7a77725e6b2
Summary: An user in github reported an issue that the installation link is broken. Somehow, docs/installation/installation.md cannot be referenced by `/docs/installation/installation`, but only by `/docs/installation`. The root cause could not figured out yet, but this change fixes it as such for now. Reviewed By: jiayuebao Differential Revision: D43757739 fbshipit-source-id: 4abd3208800c0b68e9162d381f6395897f047b24
Summary: There is another missing link in index.js for installation page. This change fixes it. Reviewed By: jiayuebao Differential Revision: D43774697 fbshipit-source-id: 1c2ff1c801ae41ea6b50e6e687714f422bcb0c92
Summary: We need to fix builds for the entire Folly stack first before this will work again Pull Request resolved: facebook#205 Reviewed By: michel-slm, jiayuebao Differential Revision: D43786012 Pulled By: jaesoo-fb fbshipit-source-id: dfd1947d7d2c294fc622496d6054c2395ef0d5a3
…itor Summary: This diff has been automatically generated by the inpage editor. NOTE: If you want to update this diff, go via the preview link inside the static docs section below. Ensure you are editing the same page that was used to create this diff. Reviewed By: therealgymmy Differential Revision: D43781316 fbshipit-source-id: 18091011f9e4592240fb41c5fb9cc8cb40e62205
Reviewed By: jiayuebao Differential Revision: D43869027 fbshipit-source-id: 6f805a86b072e2fc96d4357672175e8283e06d0d
Summary: As suggested by wonglkd in facebook#199, this change adds the build status for all targets in the main README.md (https://github.com/facebook/CacheLib) Reviewed By: therealgymmy Differential Revision: D43854616 fbshipit-source-id: add32de29c160ed06e1778fef41bb37ffc359fd7
Summary: **Context:** To enable user/intern simulations, D41134915 added 'itemValue' support in cachebench simulations. Any string that appears in the 'itemValue' column in the trace, is used to populate the cachelib item value (previously empty). For user/intern, we set this string to "0" for user and "1" for intern. Then in FBDep.cpp, we tell the admission policy to distinguish user/intern with a lambda function that parses item value. Prod uses a different lambda that reads item *header*, which has a user/intern flag set by Proxygen as part of the request to BigCache. We cannot use the same function and item header flag in simulations because the simulator does not run Proxygen. Instead, it emulates its behavior with trace + PieceWiseReplayGenerator. **This diff:** Currently, the logic that populates item value in cachebench (setStringItem below) takes a std::string and sets the item value with that. When we want to read the item in FBDep, we don't know where the end of the string is because the item can be bigger than the data. Right now, we're avoiding this by adding a null character at the end of the itemValue data string. FBDep then uses strnlen to find the length of the data until the null string. **This is not working as intended**, and the string read in FBDep includes gibberish after the "0" or "1" because it is not properly terminated. This diff fixes the issue. Reviewed By: jaesoo-fb Differential Revision: D43861361 fbshipit-source-id: d4974281e28165ecf8c23c44dc4a6b50d5b1b806
Differential Revision: D43869116 fbshipit-source-id: c62c2518691313ec82c15c9f65ac304378222257
Differential Revision: D43869054 fbshipit-source-id: e90b16576404b8eb5819c65d61fe972f0f518c66
Differential Revision: D43869046 fbshipit-source-id: 288bc98b8146424f71ece5d0685d08c59e3ed523
Differential Revision: D43869073 fbshipit-source-id: 52fea96db9cefe735ba56b564bf6af5566cd6d76
Differential Revision: D43869045 fbshipit-source-id: 120149fa4cac2590b93f0c9a9fa5f7827efca3fc
Summary: It is similar to 'moving' but requires ref count to be 0. An item which is marked for eviction causes all incRef() calls to that item to fail. This will be used to ensure that once item is selected for eviction, no one can interfere and prevent the eviction from suceeding. 'markedForEviction' relies on the same 'exlusive' bit as the 'moving' state. To distinguish between those two states, 'moving' add 1 to the refCount. This is hidden from the user, so getRefCount() will not return that extra ref. Pull Request resolved: facebook#183 Test Plan: flatmemcache shadow result: It seems there is a slight latency regression in allocation (~1 micro second) https://fburl.com/ods/u3isn8b4 It's probably small enough to ignore. But we can come back if this starts showing up in large scale in production. Imported from GitHub, without a `Test Plan:` line. Reviewed By: therealgymmy Differential Revision: D42089974 Pulled By: haowu14 fbshipit-source-id: 90477faa9443080a7ddefa37a524965a04b6f084
Differential Revision: D43869057 fbshipit-source-id: 84aa3aec8d75f04b640d6aa8393002190c97842f
Differential Revision: D43869089 fbshipit-source-id: 57d359fa8cfeaca395f636f6d3997a934afca94a
Differential Revision: D43904241 fbshipit-source-id: 3d7484bee3a62495b2f2ee299fd3b0b855a3be58
Summary: Putting extra note of separating read/write traffic when using find APIs. Reviewed By: jaesoo-fb Differential Revision: D43960690 fbshipit-source-id: 56d3a8b65c7b66b3319bee6e7ffdd959826c8dcf
Implement stubs for periodic workers for item eviction/promotion. Actual implementation of eviction and promotion will be provided in the next patch. The idea behing introducing those workers is the following: - BG eviction: to keep certain amount of free memory and decreasing allocate latency - BG promotion: to move hot items to memory decresing latency
Compensation results in ratios being different than originially specified.
Compensation results in ratios being different than originially specified.
Implement stubs for periodic workers for item eviction/promotion. Actual implementation of eviction and promotion will be provided in the next patch.
The idea behing introducing those workers is the following:
This change is