support sub compaction to speed up large compaction #70
Conversation
Can we only use the bottom level bounds to split the compaction?
@coocood This heuristic algorithm is adopted from RocksDB.
We should consider the size of each input file, so we cannot ignore the bounds of the L0 files. Because each SST file has nearly the same size, each file boundary splits the input into small ranges of comparable size. We then estimate the size of each small range, compute the estimated size of each sub compaction, and merge small ranges into larger ones. This algorithm is not fully deterministic, because we cannot split the whole input equally without iterating over it, so we use some heuristic rules here, and they have worked out well so far.

BTW, RocksDB disables sub compaction for non-L0 levels, but I enabled it for L1 when the compaction touches more than 10 SSTs; otherwise it may block the L0 -> L1 compaction.
Ping @coocood
@bobotu |
LGTM |
Use sub compaction to avoid write stalls caused by a large L1 to L2 compaction blocking L0 compaction, and to speed up L0 to L1 compaction.