-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OpenAndTrimHistory API to support trimming data with specified timestamp #9410
Conversation
Since the previous discussion was with @riversand963 , assigning to @riversand963 to triage. |
Thanks @sunlike-Lipo for the PR and sorry for late reply. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sunlike-Lipo for the PR. I left a few comments.
42de914
to
3b0a0c4
Compare
@riversand963 |
@riversand963 |
Thanks @sunlike-Lipo for the PR. I will try to review this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sunlike-Lipo for the improvement.
How do you handle data in the WAL if avoid_flush_during_recovery
is true? If possible, these data will remain in memtable.
Maybe a few options to consider:
I think 1 is good for now, but add TODO for the other two. |
Thanks @riversand963 review. I will provide a new commits this weekend |
* Skip column families that do not enable user-defined timestamp, * Add validateOptions operation in OpenAndTrimHistory, * Modified some comments in the api to clarify that the trim operation of this api is only valid for the column family with timestamp enabled, * Added cleanup operation in case of failure, clean up db and column family
@riversand963 1 has been fixed, 2, 3 also added TODO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sunlike-Lipo for the PR. Left some comments and we are getting close.
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Fix review comments
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@@ -25,7 +27,7 @@ class HistoryTrimmingIterator : public InternalIterator { | |||
return true; | |||
} | |||
Slice current_ts = ExtractTimestampFromKey(key(), cmp_->timestamp_size()); | |||
return cmp_->CompareTimestamp(current_ts, Slice(filter_ts_)) < 0; | |||
return cmp_->CompareTimestamp(current_ts, Slice(filter_ts_)) <= 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the API's definition, data with ts <= filter_ts_ should be reserved, I added a unittest for the =
case in the last commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sunlike-Lipo for the PR! LGTM. Can you resolve the conflicts and rebase?
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@riversand963 |
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
As disscussed in (#9223), Here added a new API named DB::OpenAndTrimHistory, this API will open DB and trim data to the timestamp specofied by trim_ts (The data with newer timestamp than specified trim bound will be removed). This API should only be used at a timestamp-enabled db instance recovery.
And this PR implemented a new iterator named HistoryTrimmingIterator to support trimming history with a new API named DB::OpenAndTrimHistory. HistoryTrimmingIterator wrapped around the underlying InternalITerator such that keys whose timestamps newer than trim_ts should not be returned to the compaction iterator while trim_ts is not null.