Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Move convertToRawFiles method from TableScan to Split #2361

Merged
merged 2 commits into from
Nov 22, 2023

Conversation

tsreaper
Copy link
Contributor

Purpose

It would be more natural to check if a split can be read without merging directly from that split, not from other sources.

This PR moves convertToRawFiles method from TableScan to Split.

Tests

This PR is a refactor. Existing tests should cover the correctness.

API and Format

No.

Documentation

No.

}

@Override
public List<RawFile> rawFiles() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two choices:

  1. Optional<List<RawFile>> convertToRawFiles.
  2. boolean needsMerging() and List<RawFile> rawFiles(), if need to merge, rawFiles should throws exception.

Maybe choice first is better.

@@ -89,6 +92,16 @@ public long rowCount() {
return rowCount;
}

@Override
public boolean needsMerging() {
return rawFiles.isEmpty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, thinking about supporting chang-log tables for SR or Spark native engine, needsMerge should not depend on whether the list of rawFile is empty. We can depend on whether table is a change-log table or an append-on table for now. Further, even this's a change-log table, It should be judged by whether these documents overlap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, we just introduce method to Split, this is a common method, just for non-merging reading, we can try to add more to cover more cases.

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit 19e9b13 into apache:master Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants