Skip to content

Conversation

@cbg-wx
Copy link

@cbg-wx cbg-wx commented Dec 22, 2025

Describe the issue this Pull Request addresses

  • This PR fixed HoodieMergedLogRecordScanner#processNextDeletedRecord, EventTimeAvroPayload#combineAndGetUpdateValue#needUpdatingPersistedRecord method, resovled merge delete record, tranform class issame before compareTo method when ordering field type is String or Decimal.
  • And add EventTimeAvroPayload#preCombine override this method to process delete record.

Summary and Changelog

This change involve the hudi-common module below method:

  • HoodieMergedLogRecordScanner#processNextDeletedRecord
  • EventTimeAvroPayload#combineAndGetUpdateValue#needUpdatingPersistedRecord
  • EventTimeAvroPayload#preCombine

Impact

  • Turn the delete record will always be chosen during merging,regardless of the ordering value into chose the record with max ordering value.
  • Tranform class issame before compareTo method when merged delete record with ordering field type is String or Decimal.

Risk Level

The logic that is not deleting records needs to be checked.

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

…ese is a record marked as delete record by _hoodie_is_deleted=true,the delete record will always be chosen during merging,regardless of the ordering value.
@github-actions github-actions bot added the size:L PR with lines of changes in (300, 1000] label Dec 22, 2025

Comparable curOrderingVal = oldRecord.getOrderingValue(this.readerSchema, this.hoodieTableMetaClient.getTableConfig().getProps());
Comparable deleteOrderingVal = deleteRecord.getOrderingValue();
if (curOrderingVal instanceof Utf8){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The payload classes are deprecated now (https://hudi.apache.org/releases/release-1.1.0#deprecation-of-hoodierecordpayload) and the old log record scanner is going to be removed. The new reader path with merge modes can handle the event time-based merging correctly. @cbg-wx could you check?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yihua this is a PR for 0.x-branch, should be a critical fix though.

@danny0405
Copy link
Contributor

cc @cshuo for reviewing this PR.

Copy link
Collaborator

@cshuo cshuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thks @cbg-wx for contributing, left some comments. BTW, can you also format the PR title, you can refer to the doc here.

@github-actions github-actions bot added size:M PR with lines of changes in (100, 300] and removed size:L PR with lines of changes in (300, 1000] labels Dec 24, 2025
@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@cshuo
Copy link
Collaborator

cshuo commented Dec 24, 2025

0001-fix-comments.patch

hi @cbg-wx, I've fixed some problems in the pr, could you apply the patch and push again. btw, plz use git rebase instead of git merge.

@cbg-wx cbg-wx closed this Dec 25, 2025
@cbg-wx cbg-wx deleted the feature_0_14_1_delete_record_merge branch December 25, 2025 07:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M PR with lines of changes in (100, 300]

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants