Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for range deletion when user timestamps are not persisted #12254

Closed
wants to merge 2 commits into from

Conversation

jowlyzhang
Copy link
Contributor

For the user defined timestamps in memtable only feature, some special handling for range deletion blocks are needed since both the key (start_key) and the value (end_key) of a range tombstone can contain user-defined timestamps. Handling for the key is taken care of in the same way as the other data blocks in the block based table. This PR adds the special handling needed for the value (end_key) part. This includes:

  1. On the write path, when L0 SST files are first created from flush, user-defined timestamps are removed from an end key of a range tombstone. There are places where it's logically removed (replaced with a min timestamp) because there is still logic with the running comparator that expects a user key that contains timestamp. And in the block based builder, it is eventually physically removed before persisted in a block.

  2. On the read path, when range deletion block is being read, we artificially pad a min timestamp to the end key of a range tombstone in BlockBasedTableReader.

  3. For file boundary FileMetaData.largest, we artificially pad a max timestamp to it if it contains a range deletion sentinel. Anytime when range deletion end_key is used to update file boundaries, it's using max timestamp instead of the range tombstone's actual timestamp to mark it as an exclusive end.

    rocksdb/db/dbformat.h

    Lines 923 to 935 in d69628e

    InternalKey SerializeEndKey() const {
    if (!ts_.empty()) {
    static constexpr char kTsMax[] = "\xff\xff\xff\xff\xff\xff\xff\xff\xff";
    if (ts_.size() <= strlen(kTsMax)) {
    return InternalKey(end_key_, kMaxSequenceNumber, kTypeRangeDeletion,
    Slice(kTsMax, ts_.size()));
    } else {
    return InternalKey(end_key_, kMaxSequenceNumber, kTypeRangeDeletion,
    std::string(ts_.size(), '\xff'));
    }
    }
    return InternalKey(end_key_, kMaxSequenceNumber, kTypeRangeDeletion);
    }

    This max timestamp is removed when in memory FileMetaData.largest is persisted into Manifest, we pad it back when it's read from Manifest while handling related VersionEdit in VersionEditHandler.

Test Plan:
Added unit test and enabled this feature combination's stress test.

@jowlyzhang jowlyzhang force-pushed the range_del branch 3 times, most recently from 3d68c14 to 092ff65 Compare January 19, 2024 19:51
@jowlyzhang jowlyzhang requested a review from cbi42 January 19, 2024 21:08
@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Member

@cbi42 cbi42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Left some non-blocking comments.

db/builder.cc Show resolved Hide resolved
db/db_with_timestamp_basic_test.cc Outdated Show resolved Hide resolved
db/db_with_timestamp_basic_test.cc Outdated Show resolved Hide resolved
@cbi42
Copy link
Member

cbi42 commented Jan 25, 2024

I suggest run stress test with a high delrangepercent and/or different range_deletion_width for test plan.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang merged this pull request in 071a146.

@jowlyzhang jowlyzhang deleted the range_del branch January 30, 2024 01:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants