Pin CacheEntryStatsCollector to fix performance bug #8385

Closed

Conversation

@pdillinger (Contributor) commented Jun 10, 2021

Summary: If the block Cache is full with strict_capacity_limit=false,
then our CacheEntryStatsCollector could be immediately evicted on
release, so iterating through column families with shared block cache
could trigger re-scan for each CF. This change fixes that problem by
pinning the CacheEntryStatsCollector from InternalStats so that it's not
evicted.

I had originally thought that this object could participate in LRU like
everything else, but even though a re-load+re-scan only touches memory,
it can be orders of magnitude more expensive than other cache misses.
One service at Facebook has scans that take ~20s over a 100GB block cache
that is mostly 4KB entries. (The up-side of this bug and #8369 is that
we had a natural experiment on the effect on some service metrics even
with block cache scans running continuously in the background--a kind
of worst case scenario. Metrics like latency were not affected enough
to trigger warnings.)
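
For illustration, the core of the fix is simply to keep a long-lived strong reference to the collector inside InternalStats, so that releasing the corresponding cache handle can no longer destroy the collector and force a re-scan. The sketch below is a minimal, hypothetical version of that pattern; StatsCollector and PinnedCollectorHolder are illustrative stand-ins, not the actual RocksDB classes.

```cpp
#include <memory>
#include <mutex>

// Illustrative stand-in for CacheEntryStatsCollector: re-scans the block
// cache only when its saved results are older than some max age.
struct StatsCollector {
  void CollectIfStale() { /* re-scan the cache if saved stats are too old */ }
};

// Hypothetical holder mirroring the role of InternalStats: as long as this
// object holds the shared_ptr, the collector stays alive, so iterating over
// column families that share a block cache reuses one collector instead of
// re-creating (and re-scanning with) a new one per column family.
class PinnedCollectorHolder {
 public:
  std::shared_ptr<StatsCollector> GetOrCreateCollector() {
    std::lock_guard<std::mutex> lock(mu_);
    if (!collector_) {
      collector_ = std::make_shared<StatsCollector>();
    }
    return collector_;  // pinned for the lifetime of the holder
  }

 private:
  std::mutex mu_;
  std::shared_ptr<StatsCollector> collector_;
};
```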

Other smaller fixes:

20s is already a sizable portion of the 600s stats dump period, or the 180s
default max age to force re-scan, so this change adds logic to ensure that
(for each block cache) we don't spend more than 0.2% of our background thread
time scanning it (see the sketch below). Nevertheless, "foreground" requests
for cache entry stats (calls to
db->GetMapProperty(DB::Properties::kBlockCacheEntryStats)) are permitted to
consume more CPU.

Renamed field to cache_entry_stats_ to match code style.
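
To make the 0.2% budget concrete: if one scan takes roughly 20 seconds, background-triggered collection has to wait on the order of 500x the scan duration (about 10,000s) before scanning again, which dominates both the 600s dump period and the 180s default max age. The sketch below only illustrates that interval math; the function name and the exact foreground factor are assumptions, not the actual RocksDB code.

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative sketch: cap background scans at ~0.2% of background thread
// time by stretching the effective "max age" to at least 500x (== 1 / 0.002)
// the duration of the last scan. Foreground callers are assumed to tolerate
// a tighter factor; the exact numbers here are illustrative.
uint64_t EffectiveMaxAgeMicros(uint64_t requested_max_age_micros,
                               uint64_t last_scan_duration_micros,
                               bool foreground) {
  const uint64_t min_interval_factor = foreground ? 10 : 500;
  return std::max(requested_max_age_micros,
                  min_interval_factor * last_scan_duration_micros);
}
```

For example, with a 20s scan and a 180s requested max age, a background caller would not re-scan for std::max(180s, 500 * 20s) = 10,000s, while a foreground db->GetMapProperty(DB::Properties::kBlockCacheEntryStats) call would wait only std::max(180s, 10 * 20s) = 200s under these assumed factors.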

This change is intended for patching into the 6.21 release.

Test Plan: unit test expanded to cover new logic (detect regression),
some manual testing with db_bench

@facebook-github-bot

@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@pdillinger has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot

@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr (Contributor) left a comment

LGTM; adding Cache::Priority::HIGH and the percentage-based gap between scans both seem like improvements.

I didn't quite understand whether the goal is to entirely prevent edge cases of high collection frequency or just to make them less likely. My current understanding is that it does the latter.

  max_age_micros = std::max(
      max_age_micros, min_interval_factor * (last_end_time_micros_ -
                                             last_start_time_micros_));
}
Nice, this is easier to understand than the original. I guess that's due to only using std::max().

  // difficult to access that setting from here with just cfd_
  collector->GetStats(&cache_entry_stats);

Status InternalStats::CollectCacheEntryStats(bool foreground) {
  // Lazy initialize/reference the collector. It is pinned in cache (through

Assuming you mean pinned in memory, please correct if I'm mistaken...

I wonder if different InternalStatses could have different collectors forever if the initial collector each one creates is quickly ejected.

@ajkr (Contributor) commented Jun 12, 2021

Damn, right after submitting my review I remembered about MakeSharedCacheHandleGuard. Ignore my doubts, I agree it should entirely prevent edge cases of high collection frequency.
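
For context, the handle-guard idea referenced here can be sketched as follows: wrap a cache handle in a shared_ptr whose deleter releases the handle, so the entry stays pinned (and every caller that looks it up gets the same collector object) for as long as any copy of the shared_ptr survives. The SimpleCache interface and MakeSharedHandleGuard name below are simplified stand-ins for illustration, not the actual RocksDB Cache API.

```cpp
#include <memory>
#include <string>

// Simplified stand-in for a cache that pins entries while a handle is open.
struct SimpleCache {
  struct Handle;                      // opaque pinned reference to an entry
  Handle* Lookup(const std::string& key);
  void* Value(Handle* handle);        // object stored in the entry
  void Release(Handle* handle);       // unpin; entry becomes evictable
};

// Hedged sketch of a "shared cache handle guard": the returned shared_ptr
// points at the cached object, and its deleter releases the cache handle,
// so the entry cannot be evicted while any copy of the shared_ptr is alive.
template <typename T>
std::shared_ptr<T> MakeSharedHandleGuard(SimpleCache* cache,
                                         const std::string& key) {
  SimpleCache::Handle* handle = cache->Lookup(key);
  if (handle == nullptr) {
    return nullptr;  // a real implementation would insert and retry here
  }
  T* value = static_cast<T*>(cache->Value(handle));
  // The deleter releases the cache handle rather than deleting the object;
  // the cache itself owns the object's lifetime.
  return std::shared_ptr<T>(value, [cache, handle](T*) {
    cache->Release(handle);
  });
}
```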

@facebook-github-bot

@pdillinger merged this pull request in d5a46c4.

pdillinger added a commit that referenced this pull request Jun 14, 2021
Pull Request resolved: #8385

Reviewed By: ajkr

Differential Revision: D29042759

Pulled By: pdillinger

fbshipit-source-id: 236faa902397f50038c618f50fbc8cf3f277308c