Description
Our shorter critical section should perform better than the longer upstream version when the hashtable is small, since it does not iterate through hashtable collisions with the mmContainer lock held. The performance gains from a shorter critical section tend to dissipate as hashtable size increases, since the longer upstream version then iterates through fewer collision entries (so its critical section is shorter at higher hashtable power).
The purpose of this experiment is to examine the tradeoff between using a shorter critical section with a smaller hashtable and using the upstream longer critical section with a larger hashtable. In our experiment we limit the total RSS of cachebench to 16GB (including cache size and hashtable overhead).
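The collision argument above can be made concrete with a rough load-factor estimate. This is a minimal sketch, assuming uniform hashing and a made-up item count (`NUM_ITEMS` is hypothetical and not taken from either workload); it only shows how the average number of entries per bucket changes between the two htBucketPower values used below.

```python
def expected_chain_length(num_items: int, bucket_power: int) -> float:
    """Average entries per bucket (load factor), assuming uniform hashing."""
    return num_items / (1 << bucket_power)


# Hypothetical item count, chosen only for illustration.
NUM_ITEMS = 100_000_000

for power in (25, 30):
    avg = expected_chain_length(NUM_ITEMS, power)
    print(f"htBucketPower={power}: ~{avg:.2f} entries per bucket")
```

Whatever the real item count is, moving from power 30 to power 25 multiplies the average chain length by 32, which is the extra work the upstream version would do while holding the lock.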
Configs
- Config 1 (upstream) - set cache size to 8GB and htBucketPower to 30
- Config 2 (upstream) - set cache size to 15.75GB and htBucketPower to 25
- Config 3 (critical section patch) - set cache size to 8GB and htBucketPower to 30
- Config 4 (critical section patch) - set cache size to 15.75GB and htBucketPower to 25
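The two cache size / htBucketPower pairs above both fit the 16GB budget if each hashtable bucket costs 8 bytes. That per-bucket size is an assumption inferred from the numbers in the configs, not something stated here, and the sketch also treats GB as 2^30 bytes and ignores any other process overhead that contributes to RSS.

```python
GIB = 1 << 30
BYTES_PER_BUCKET = 8  # assumption inferred from the configs, not a documented value


def hashtable_bytes(bucket_power: int) -> int:
    """Memory used by a table of 2**bucket_power buckets."""
    return (1 << bucket_power) * BYTES_PER_BUCKET


def total_gib(cache_gib: float, bucket_power: int) -> float:
    """Cache size plus hashtable overhead, in GiB."""
    return cache_gib + hashtable_bytes(bucket_power) / GIB


for cache_gib, power in ((8, 30), (15.75, 25)):
    print(f"cache={cache_gib}GB, htBucketPower={power}: "
          f"total={total_gib(cache_gib, power)}GB")
```

Under these assumptions, power 30 costs 8GB of hashtable and power 25 costs 0.25GB, so both pairs sum to 16GB.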
Use the graph_cache_leader_fbobj and graph_cache_follower_fbobj workloads with default parameters other than the above. This gives 8 experiments in total: 2 workloads * 4 configs.
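The run matrix can be enumerated as in the sketch below. The dictionary keys (`name`, `build`, `cache_size_gb`, `ht_bucket_power`) are hypothetical labels for bookkeeping, not cachebench config fields, and the sketch does not launch anything.

```python
from itertools import product

WORKLOADS = ["graph_cache_leader_fbobj", "graph_cache_follower_fbobj"]

# Hypothetical labels describing the four configs listed above.
CONFIGS = [
    {"name": "config1", "build": "upstream", "cache_size_gb": 8, "ht_bucket_power": 30},
    {"name": "config2", "build": "upstream", "cache_size_gb": 15.75, "ht_bucket_power": 25},
    {"name": "config3", "build": "critical section patch", "cache_size_gb": 8, "ht_bucket_power": 30},
    {"name": "config4", "build": "critical section patch", "cache_size_gb": 15.75, "ht_bucket_power": 25},
]


def experiment_matrix():
    """Return every (workload, config) pair to run."""
    return list(product(WORKLOADS, CONFIGS))


for workload, cfg in experiment_matrix():
    print(f"{workload}: {cfg['name']} ({cfg['build']}), "
          f"cache={cfg['cache_size_gb']}GB, htBucketPower={cfg['ht_bucket_power']}")
```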
Upstream: https://github.com/facebook/CacheLib/tree/main, commit ba170d0
edit: use CacheLib branch here: https://github.com/igchor/CacheLib-1/tree/optimize_mmcontainer_locking?