db: compactions: calculate eventual output level #5333
base: master
Conversation
We no longer calculate `maxOutputFileSize` and `maxOverlapBytes` inside `pickedTableCompaction`, as they are not used there. We calculate them in `newCompaction` instead. We no longer store a `maxReadCompactionBytes` field; we calculate it in the one place we need it.
We improve the compaction code to check if the levels below the compaction bounds are empty, in which case we adjust the output file size and other sstable writer options to correspond to the eventual level (after move compactions).
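As a rough illustration of the idea only (this is not Pebble's implementation; `lsmShape`, `levelFileCount`, and `eventualOutputLevel` are hypothetical names, and the sketch simplifies the check to whole-level emptiness rather than emptiness within the compaction bounds), a minimal Go sketch:

```go
package main

import "fmt"

const numLevels = 7

// lsmShape stands in for the real version metadata; in Pebble the per-level
// file information would come from the current Version.
type lsmShape struct {
	levelFileCount [numLevels]int
}

// eventualOutputLevel returns the lowest level the compaction output could
// end up in via move compactions, under the simplifying assumption that a
// level is a valid move target only if it is completely empty.
func eventualOutputLevel(v lsmShape, outputLevel int) int {
	eventual := outputLevel
	for level := outputLevel + 1; level < numLevels; level++ {
		if v.levelFileCount[level] != 0 {
			break
		}
		eventual = level
	}
	return eventual
}

func main() {
	// The output nominally lands in L2, but L3 and L4 are empty, so the
	// sstables would eventually be moved down to L4.
	var v lsmShape
	v.levelFileCount[2] = 3
	v.levelFileCount[5] = 10
	fmt.Println(eventualOutputLevel(v, 2)) // 4
}
```

The eventual level computed this way is what the output file size and other sstable writer options would be tuned for, per the description above.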
@jbowens reviewed 5 of 5 files at r1, 5 of 5 files at r2, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @sumeerbhola)
@sumeerbhola reviewed 5 of 5 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @RaduBerinde)
compaction.go
line 276 at r2 (raw file):
// Because of move compactions, we know that any sstables produced by this
// compaction will be later moved to eventualOutputLevel. So we use
(drive-by) I suppose something new could be compacted into this level, before the move compaction. Since the move compaction may be delayed if the level score is low. I wonder whether it would be better to just place it in the final output level instead of relying on such move compactions happening at the right time. And by making these files bigger while not putting them in the lowest level possible, we do have the small risk that the size of a compaction increases, since now, say, an L2=>L3 compaction could encounter a much larger file in L3.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)
compaction.go
line 276 at r2 (raw file):
I suppose something new could be compacted into this level, before the move compaction.
How can that happen? I believe that the other compaction would necessarily have to overlap with this compaction on the outputLevel, which is not allowed.
Since the move compaction may be delayed if the level score is low.
Good point.
I wonder whether it would be better to just place it in the final output level instead of relying on such move compactions happening at the right time.
I can explore that. It would reduce some manifest churn as well. I think in that case we can update the outputLevel when we create the compaction and make sure that any overlap checks include any intermediary levels. Can you think of anything else that might be a problem?
I will say that as I worked on this and tried to come up with a test case without defining a specific LSM, I am starting to doubt the utility of this whole thing, especially since it's difficult to do it for flushes.
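To make the overlap-check suggestion above concrete, here is a hedged sketch (purely illustrative; `keyRange`, `compactingRange`, and `conflictsWithInFlight` are hypothetical and not Pebble APIs) of a check that treats every level between the nominal and final output level as part of the compaction:

```go
package main

import "fmt"

// keyRange is a hypothetical half-open key interval [start, end).
type keyRange struct {
	start, end string
}

func (a keyRange) overlaps(b keyRange) bool {
	return a.start < b.end && b.start < a.end
}

// compactingRange records the key bounds of an in-flight compaction that
// involves a given level.
type compactingRange struct {
	level  int
	bounds keyRange
}

// conflictsWithInFlight reports whether placing the output directly in
// finalLevel would conflict with an in-flight compaction on any level from
// nominalLevel through finalLevel, not just on nominalLevel itself.
func conflictsWithInFlight(inFlight []compactingRange, nominalLevel, finalLevel int, bounds keyRange) bool {
	for _, c := range inFlight {
		if c.level >= nominalLevel && c.level <= finalLevel && c.bounds.overlaps(bounds) {
			return true
		}
	}
	return false
}

func main() {
	inFlight := []compactingRange{{level: 4, bounds: keyRange{"c", "f"}}}
	// Output nominally destined for L2 but placed directly in L5: the
	// in-flight compaction over [c, f) in L4 now matters.
	fmt.Println(conflictsWithInFlight(inFlight, 2, 5, keyRange{"a", "d"})) // true
}
```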
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @RaduBerinde)
compaction.go
line 276 at r2 (raw file):
I can explore that. It would reduce some manifest churn as well
I haven't really given this more than passing thought, wrt whether there are write amp downsides to it.
It will require some care wrt coordination with ingests.
Previously, sumeerbhola wrote…
Yeah, it looks like it's not an easy change. I will probably merge this as-is, but after the branch cut.
db: minor compaction cleanup

We no longer calculate `maxOutputFileSize` and `maxOverlapBytes` inside
`pickedTableCompaction`, as they are not used there. We calculate them in
`newCompaction` instead. We no longer store a `maxReadCompactionBytes`
field; we calculate it in the one place we need it.
db: compactions: calculate eventual output level
We improve the compaction code to check if the levels below the
compaction bounds are empty, in which case we adjust the output file
size and other sstable writer options to correspond to the eventual
level (after move compactions).