Add simple heuristics for experimental mempurge. #8583
@bjlemaire has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
5b0eb03 to e0040fe
@bjlemaire has updated the pull request. You must reimport the pull request before landing.
I might call these "trivial" or "simple" heuristics but LGTM
Specifically, likely hits to CPU usage, read efficiency, and maximum burst write throughput, though not always so.
…O print about mempurge/memflush execution times.
fccf5eb to a40adb1
@bjlemaire merged this pull request in 4361d6d.
Summary:
Add the `experimental_mempurge_policy` option flag and introduce two new `MemPurge` (Memtable Garbage Collection) policies: `ALWAYS` and `ALTERNATE`. Default value: `ALTERNATE`.

`ALWAYS`: every flush first goes through a `MemPurge` process. If the output is too big to fit into a single memtable, the mempurge is aborted and a regular flush process carries on. `ALWAYS` is designed for users who need to reduce the number of L0 SST files created to a strict minimum and can afford a small dent in performance (possibly hits to CPU usage, read efficiency, and maximum burst write throughput).

`ALTERNATE`: a flush is transformed into a `MemPurge` unless one of the memtables being flushed is the product of a previous `MemPurge`. `ALTERNATE` is a good tradeoff between performance and the reduction in the number of L0 SST files created. `ALTERNATE` performs particularly well for completely random garbage ratios, or garbage ratios anywhere in (0%, 50%], and even higher when there is wide variability in garbage ratios.

This PR also includes support for `experimental_mempurge_policy` in `db_bench`.

Testing was done locally by replacing all the `MemPurge` policies of the unit tests with `ALTERNATE`, as well as local testing with `db_crashtest.py` `whitebox` and `blackbox`. Overall, if an `ALWAYS` mempurge policy passes the tests, there is no reason why an `ALTERNATE` policy would fail, and therefore the mempurge policy was set to `ALWAYS` for all mempurge unit tests.

Pull Request resolved: facebook#8583
Reviewed By: pdillinger
Differential Revision: D29888050
Pulled By: bjlemaire
fbshipit-source-id: e2cf26646d66679f6f5fb29842624615610759c1
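The decision rule described in the summary can be sketched as follows. This is a minimal illustration of the two policies as worded above, not the code from this PR: the names `PurgePolicy`, `MemTableInfo`, `ShouldAttemptMemPurge`, and `DecideFlush`, as well as the byte-size parameters, are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the RocksDB types.
enum class PurgePolicy { kAlternate, kAlways };

struct MemTableInfo {
  // True if this memtable is the output of a previous mempurge.
  bool from_mempurge;
};

enum class FlushOutcome { kMemPurge, kRegularFlush };

// Returns true if, under the given policy, the flush should first be
// attempted as a mempurge.
bool ShouldAttemptMemPurge(PurgePolicy policy,
                           const std::vector<MemTableInfo>& mems) {
  if (mems.empty()) {
    return false;  // Nothing to purge.
  }
  if (policy == PurgePolicy::kAlways) {
    return true;  // ALWAYS: every flush first goes through a mempurge.
  }
  // ALTERNATE: skip the mempurge if any memtable being flushed is
  // itself the product of a previous mempurge.
  for (const MemTableInfo& m : mems) {
    if (m.from_mempurge) {
      return false;
    }
  }
  return true;
}

// Combines the policy check with the abort rule from the summary: if the
// purged output does not fit into a single memtable, the mempurge is
// aborted and a regular flush carries on. Both sizes are assumed inputs.
FlushOutcome DecideFlush(PurgePolicy policy,
                         const std::vector<MemTableInfo>& mems,
                         std::size_t purged_output_bytes,
                         std::size_t memtable_capacity_bytes) {
  if (!ShouldAttemptMemPurge(policy, mems)) {
    return FlushOutcome::kRegularFlush;
  }
  if (purged_output_bytes > memtable_capacity_bytes) {
    return FlushOutcome::kRegularFlush;  // Mempurge aborted.
  }
  return FlushOutcome::kMemPurge;
}
```

Under this sketch, `ALTERNATE` never purges the same data twice in a row, which is one way to read why it trades fewer L0 SST files against the extra CPU cost that `ALWAYS` may incur.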