Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spark 3.0: Fix delete from snapshot of table #3840

Merged
merged 2 commits into from
Jan 4, 2022

Conversation

izchen
Copy link
Contributor

@izchen izchen commented Jan 4, 2022

Fix UT case testDeleteFromTableAtSnapshot. This UT failed in spark version 3.0. I think this is because the [SPARK-33623][SQL] Add canDeleteWhere to SupportsDelete does not exist in 3.0.

@github-actions github-actions bot added the spark label Jan 4, 2022
@izchen
Copy link
Contributor Author

izchen commented Jan 4, 2022

Related PR: #3799 #3769

@izchen
Copy link
Contributor Author

izchen commented Jan 4, 2022

cc @rdblue @wypoon

@@ -244,6 +244,7 @@ private boolean requiresRewrite(Filter filter, Schema schema, Set<Integer> ident

@Override
public void deleteWhere(Filter[] filters) {
canDeleteWhere(filters);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the check is done for the other versions of Spark, but adding canDeleteWhere here is expensive and not the right solution. Can you add the precondition instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdblue Thanks for the review!

Comment on lines +247 to +250
Preconditions.checkArgument(
snapshotId == null,
"Cannot delete from table at a specific snapshot: %s", snapshotId);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please remove this same check from canDeleteWhere above? That way, the check is present in only one place, the place where it is actually called.
IIUC, Spark 3.0 doesn't actually call SparkTable#canDeleteWhere (Spark 3.1 does); we have canDeleteWhere in the code here because it was added when we had a single code base for both Spark 3.0 and 3.1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the implementation of canDeleteWhere matters much for Spark 3.0. We could simply remove the function since it is unused, but it doesn't seem worth letting the code drift either to remove it or to modify it to have this check in just one place. So I guess I disagree with this suggestion. Let's just leave it as is.

@wypoon
Copy link
Contributor

wypoon commented Jan 4, 2022

Apart from one comment, LGTM.
@izchen thanks for finding and fixing this!

@rdblue rdblue merged commit 051af40 into apache:master Jan 4, 2022
@izchen izchen deleted the fix_delete_from_snapshot_3_0 branch January 5, 2022 02:31
@izchen
Copy link
Contributor Author

izchen commented Jan 5, 2022

Thank you everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants