[Spark] Rename FileNames.deltaFile to FileNames.unsafeDeltaFile #2838

Merged
merged 2 commits into delta-io:master
Apr 10, 2024

Conversation

sumeet-db (Collaborator) commented Apr 2, 2024

Which Delta project/connector is this regarding?

  • [x] Spark
  • [ ] Standalone
  • [ ] Flink
  • [ ] Kernel
  • [ ] Other (fill in here)

Description

Previously, certain code paths assumed that the delta file for a given version always exists at the predictable path _delta_log/$version.json. This assumption no longer holds with managed commits, where the delta file may instead be located at _delta_log/_commits/$version.$uuid.json. This PR renames the old method to unsafeDeltaFile to warn future callers that it is incorrect for tables with managed commits.
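
The two layouts described above can be sketched as follows. This is a minimal illustration, not the delta-spark implementation: the object and method names are hypothetical, plain strings stand in for Hadoop paths, and the 20-digit zero padding of the version is an assumption about how $version is rendered in file names.

```scala
import java.util.UUID

// Hypothetical helper names for illustration; these are not the delta-spark API.
object CommitPathSketch {
  // Assumption: the version is zero-padded to 20 digits in file names.
  private def padded(version: Long): String = f"$version%020d"

  // Predictable location: _delta_log/$version.json
  def backfilledPath(logDir: String, version: Long): String =
    s"$logDir/${padded(version)}.json"

  // Managed-commit location: _delta_log/_commits/$version.$uuid.json
  // The UUID cannot be derived from the version, so this path is not predictable.
  def managedCommitPath(logDir: String, version: Long, uuid: UUID): String =
    s"$logDir/_commits/${padded(version)}.$uuid.json"
}
```

A caller that only knows the version can compute the first path but not the second, which is why code that builds the path from the version alone can be wrong for managed-commit tables.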

To avoid breaking dependent systems, the plan is:

  1. Update all delta-spark usages to use the unsafe method. (current PR)
  2. Deprecate the deltaFile method. (current PR)
  3. Remove the deprecated method once it's proven to be safe. (future PR)
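
One way the rename-then-deprecate steps above could look is sketched below. It assumes a simplified stand-in object (FileNamesSketch is hypothetical, and strings replace Hadoop paths); the deprecation message and the "since" value are placeholders, not the text used in the actual PR.

```scala
// Hypothetical stand-in for the real FileNames object, for illustration only.
object FileNamesSketch {
  // New, explicitly "unsafe" name: only correct when the commit
  // lives at _delta_log/$version.json.
  def unsafeDeltaFile(logDir: String, version: Long): String =
    f"$logDir/$version%020d.json"

  // Old name kept as a thin deprecated alias so dependent systems keep compiling.
  // Message and version string are placeholders.
  @deprecated("Use unsafeDeltaFile; not valid for managed-commit tables", "sketch")
  def deltaFile(logDir: String, version: Long): String =
    unsafeDeltaFile(logDir, version)
}
```

With this shape, existing callers of deltaFile get a compiler deprecation warning instead of a build break, and the alias can be deleted in a later change.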

How was this patch tested?

Unit tests (UTs)

Does this PR introduce any user-facing changes?

No

@sumeet-db sumeet-db requested a review from prakharjain09 April 2, 2024 05:59
@sumeet-db sumeet-db force-pushed the rename-1 branch 3 times, most recently from dafac6b to 4f0671c Compare April 3, 2024 18:11
@scottsand-db scottsand-db merged commit 0fe578b into delta-io:master Apr 10, 2024
7 checks passed
andreaschat-db pushed a commit to andreaschat-db/delta that referenced this pull request Apr 16, 2024
…a-io#2838)
