HBASE-29716 Include sequence ID on incremental backup HFiles #221

krconv · 2025-11-25T00:34:03Z

The HFiles generated by incremental backups cannot be properly read by tooling such as the ClientSideRequestScanner, because the generated HFiles do not include the MAX_SEQ_ID metadata. The scanner will ignore cell-level sequence IDs and instead sort the HFiles arbitrarily. This causes incorrect results when scanning overwrites to cells with the same timestamp.
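To make the failure mode concrete, here is a minimal, self-contained sketch of the tie-break described above. It is a simplified model, not HBase code: `SeqIdTieBreak`, `CellVersion`, and `resolve` are hypothetical names, and it assumes that files without the metadata all carry the same sequence ID, so the result falls back to whatever order the files happen to be listed in.

```java
import java.util.Arrays;
import java.util.List;

public class SeqIdTieBreak {
    /** One version of the same cell, as stored in one HFile (simplified, hypothetical model). */
    static final class CellVersion {
        final long timestamp;
        final long fileSeqId; // stands in for the file-level sequence ID metadata
        final String value;

        CellVersion(long timestamp, long fileSeqId, String value) {
            this.timestamp = timestamp;
            this.fileSeqId = fileSeqId;
            this.value = value;
        }
    }

    /**
     * Picks the winning version: newest timestamp first, then highest file sequence ID.
     * On a full tie the first version in list order wins, which models "arbitrary" ordering.
     */
    static String resolve(List<CellVersion> versions) {
        CellVersion winner = null;
        for (CellVersion v : versions) {
            boolean newer = winner == null
                || v.timestamp > winner.timestamp
                || (v.timestamp == winner.timestamp && v.fileSeqId > winner.fileSeqId);
            if (newer) {
                winner = v;
            }
        }
        return winner == null ? null : winner.value;
    }

    public static void main(String[] args) {
        long ts = 1000L; // both writes share the same timestamp

        // With sequence IDs, the later write wins regardless of file order.
        System.out.println(resolve(Arrays.asList(
            new CellVersion(ts, 3L, "old"), new CellVersion(ts, 7L, "new"))));
        System.out.println(resolve(Arrays.asList(
            new CellVersion(ts, 7L, "new"), new CellVersion(ts, 3L, "old"))));

        // Without them (assumed here to be equal IDs), the result depends on file order.
        System.out.println(resolve(Arrays.asList(
            new CellVersion(ts, 0L, "old"), new CellVersion(ts, 0L, "new"))));
        System.out.println(resolve(Arrays.asList(
            new CellVersion(ts, 0L, "new"), new CellVersion(ts, 0L, "old"))));
    }
}
```

In this model the first two calls agree, while the last two disagree with each other, which is the kind of order-dependent result the description refers to.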

This PR adds a new option to HFileOutputFormat2 that calculates and sets the required metadata. This only really affects the ClientSideRequestScanner, as the sequence ID is recalculated anyway when the files are bulk-loaded.

Part of https://issues.apache.org/jira/browse/HBASE-29716

Upstream PR: apache#7480

hgromer · 2025-11-25T14:05:28Z

Looks good for HS, aside from a minor comment I left here

HBASE-29716 Include sequence ID on incremental backup HFiles

046cb69

krconv force-pushed the HBASE-29716-set-sequence-id-option-2.6 branch from daf2dff to 046cb69 Compare November 25, 2025 02:13

krconv requested review from hgromer and sidkhillon November 25, 2025 10:31

hgromer approved these changes Nov 25, 2025

krconv merged commit ad62e7f into hubspot-2.6 Nov 25, 2025
1 check passed

krconv deleted the HBASE-29716-set-sequence-id-option-2.6 branch November 25, 2025 19:01

3 participants