Skip to content

[Fix][Connectors-v2] Avoid Iceberg DROP_DATA full-table delete commit#10851

Open
corgy-w wants to merge 3 commits into
apache:devfrom
corgy-w:corgy/dev-iceberg-drop-data-recreate
Open

[Fix][Connectors-v2] Avoid Iceberg DROP_DATA full-table delete commit#10851
corgy-w wants to merge 3 commits into
apache:devfrom
corgy-w:corgy/dev-iceberg-drop-data-recreate

Conversation

@corgy-w
Copy link
Copy Markdown
Contributor

@corgy-w corgy-w commented May 7, 2026

Purpose of this pull request

Iceberg DROP_DATA previously used a full-table delete commit. That path can load metadata, snapshots, and manifests aggressively enough to OOM large tables before writes start.

A previous attempt to always switch DROP_DATA to metadata reset avoided that path, but it also regressed the existing iceberg.table.commit-branch capability by rejecting tables with non-main refs.

This patch adds an explicit Iceberg-only iceberg.drop-data.strategy option with two modes:

  • DELETE_COMMIT (default): keeps the historical delete-commit behavior and targets iceberg.table.commit-branch when configured
  • HARD_METADATA_RESET: removes all snapshot refs and snapshots through TableOperations.commit, then recreates the configured non-main commit branch if needed

The patch also removes the redundant instanceof IcebergCatalog guard from the sink save-mode path, fixes the Iceberg catalog factory error text, updates the Iceberg sink docs, and adds unit coverage for both strategies.

Does this PR introduce any user-facing change?

Yes. Iceberg sink users can now choose iceberg.drop-data.strategy.

The default remains DELETE_COMMIT, so existing DROP_DATA jobs keep their historical behavior, including branch-targeted delete commits when iceberg.table.commit-branch is configured.

Users who explicitly choose HARD_METADATA_RESET get a table-wide metadata reset that clears all refs and snapshots, then recreates the configured non-main commit branch. Orphan file cleanup is still left to normal Iceberg maintenance.

How was this patch tested?

  • Ran JAVA_HOME=/Users/coolcorgy/Library/Java/JavaVirtualMachines/azul-17.0.11/Contents/Home PATH=/Users/coolcorgy/Library/Java/JavaVirtualMachines/azul-17.0.11/Contents/Home/bin:$PATH ./mvnw -o -nsu -pl seatunnel-connectors-v2/connector-iceberg spotless:apply
  • Updated IcebergCatalogMetadataResetTest and IcebergSaveModeHandlerTest to cover both strategies and branch handling

@corgy-w corgy-w force-pushed the corgy/dev-iceberg-drop-data-recreate branch from 2dc9413 to d37f651 Compare May 7, 2026 09:20
Copy link
Copy Markdown
Contributor

@davidzollo davidzollo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I reviewed the full current head locally on seatunnel-review-10851 against upstream/dev. This was a source-level review only; I did not run local Maven in this round.

What This PR Fixes

  • User pain: DROP_DATA on Iceberg should avoid a heavy full-table delete commit when a safer metadata-level reset is possible.
  • Fix approach: keepSchemaDropData() now routes into truncateTable(), and truncateTable() replaces the old deleteFromRowFilter(alwaysTrue()) path with a metadata reset.
  • One-line summary: the metadata-reset direction is good for plain tables, but the current head introduces a new hard failure for tables that have non-main snapshot refs, which is a visible behavior narrowing compared with the old path.

1. Code Review

1.1 Core Logic Analysis

  • Exact change scope:

    • seatunnel-connectors-v2/connector-iceberg/src/main/java/org/apache/seatunnel/connectors/seatunnel/iceberg/sink/IcebergSaveModeHandler.java:43-51
    • seatunnel-connectors-v2/connector-iceberg/src/main/java/org/apache/seatunnel/connectors/seatunnel/iceberg/catalog/IcebergCatalog.java:276-314
  • Before / after:

// older truncate path
catalog.loadTable(icebergTableIdentifier)
        .newDelete()
        .deleteFromRowFilter(Expressions.alwaysTrue())
        .commit();
// current path
Set<String> nonMainRefs = table.refs().keySet().stream()
        .filter(refName -> !SnapshotRef.MAIN_BRANCH.equals(refName))
        .collect(Collectors.toCollection(LinkedHashSet::new));
if (!nonMainRefs.isEmpty()) {
    throw new CatalogException(...);
}
// then perform metadata reset
  • Key findings:

    1. The normal KEEP_SCHEMA + DROP_DATA runtime path definitely hits this change.
    2. The metadata-reset direction itself is reasonable for plain tables.
    3. The new implementation now rejects any table that has non-main snapshot refs.
    4. That is a real behavior narrowing compared with the old path, not just an internal refactor.
  • Runtime chain I checked:

save mode path
  -> DefaultSaveModeHandler.keepSchemaDropData()
      -> IcebergSaveModeHandler.keepSchemaDropData()
          -> icebergCatalog.truncateTable(tablePath, true)
              -> loadTable(...)
              -> inspect table.refs()
              -> if non-main refs exist -> throw CatalogException
              -> else perform metadata reset commit

1.2 Compatibility Impact

  • Verdict: partially incompatible.
  • API/config/default/protocol/serialization are unchanged.
  • The incompatible part is runtime behavior: a table that previously could still be cleared through the older path is now rejected immediately if it has non-main refs.

1.3 Performance / Side Effects

  • For plain tables, the metadata-reset approach is a reasonable performance win.
  • The main side effect is compatibility-related: the supported table shape is narrower than before.

1.4 Error Handling and Logs

Issue 1: Tables with non-main snapshot refs are now rejected outright

  • Location: IcebergCatalog.java:285
  • Why this is a real problem:
    this code sits directly on the real KEEP_SCHEMA + DROP_DATA path. If the table has non-main refs, the PR now turns that into a hard failure instead of falling back to a behavior-compatible truncation path.
  • Risk:
    users with Iceberg tag/branch usage can hit a main-path failure after upgrading to this implementation.
  • Suggested fix:
    if non-main refs are present, fall back to the older delete-based truncation path instead of failing the operation outright. If you intentionally want to keep the new limitation, it needs to be documented explicitly as an incompatible behavior change.
  • Severity: High

2. Code Quality

  • The implementation is cleanly structured.
  • I would still add one contract-style test around tables with non-main refs, because that is exactly where the current compatibility regression sits.

3. Architecture

  • For plain tables this is a good direction.
  • The blocker is that the current head narrows the supported table shape without a compatibility-safe fallback.

4. Issue Summary

No. Issue Location Severity
1 Tables with non-main snapshot refs are now rejected outright IcebergCatalog.java:285 High

5. Merge Conclusion

Conclusion: merge after fixes

  1. Blocking items
  • Issue 1 must be fixed before merge.
  1. Suggested non-blocking follow-up
  • Add one contract test around the non-main-ref path so this boundary stays explicit.

The metadata-reset idea is worth keeping. The blocker is the new compatibility regression on tagged/branched Iceberg tables, not the overall direction.

@davidzollo davidzollo dismissed their stale review May 7, 2026 14:35

Posted from the wrong account by mistake; superseded by the DanielLeens review.

Copy link
Copy Markdown
Contributor

@DanielLeens DanielLeens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I reviewed the full current head locally on seatunnel-review-10851 against upstream/dev. This was a source-level review only; I did not run local Maven in this round.

What This PR Fixes

  • User pain: DROP_DATA on Iceberg should avoid a heavy full-table delete commit when a safer metadata-level reset is possible.
  • Fix approach: keepSchemaDropData() now routes into truncateTable(), and truncateTable() replaces the old deleteFromRowFilter(alwaysTrue()) path with a metadata reset.
  • One-line summary: the metadata-reset direction is good for plain tables, but the current head introduces a new hard failure for tables that have non-main snapshot refs, which is a visible behavior narrowing compared with the old path.

1. Code Review

1.1 Core Logic Analysis

  • Exact change scope:

    • seatunnel-connectors-v2/connector-iceberg/src/main/java/org/apache/seatunnel/connectors/seatunnel/iceberg/sink/IcebergSaveModeHandler.java:43-51
    • seatunnel-connectors-v2/connector-iceberg/src/main/java/org/apache/seatunnel/connectors/seatunnel/iceberg/catalog/IcebergCatalog.java:276-314
  • Before / after:

// older truncate path
catalog.loadTable(icebergTableIdentifier)
        .newDelete()
        .deleteFromRowFilter(Expressions.alwaysTrue())
        .commit();
// current path
Set<String> nonMainRefs = table.refs().keySet().stream()
        .filter(refName -> !SnapshotRef.MAIN_BRANCH.equals(refName))
        .collect(Collectors.toCollection(LinkedHashSet::new));
if (!nonMainRefs.isEmpty()) {
    throw new CatalogException(...);
}
// then perform metadata reset
  • Key findings:

    1. The normal KEEP_SCHEMA + DROP_DATA runtime path definitely hits this change.
    2. The metadata-reset direction itself is reasonable for plain tables.
    3. The new implementation now rejects any table that has non-main snapshot refs.
    4. That is a real behavior narrowing compared with the old path, not just an internal refactor.
  • Runtime chain I checked:

save mode path
  -> DefaultSaveModeHandler.keepSchemaDropData()
      -> IcebergSaveModeHandler.keepSchemaDropData()
          -> icebergCatalog.truncateTable(tablePath, true)
              -> loadTable(...)
              -> inspect table.refs()
              -> if non-main refs exist -> throw CatalogException
              -> else perform metadata reset commit

1.2 Compatibility Impact

  • Verdict: partially incompatible.
  • API/config/default/protocol/serialization are unchanged.
  • The incompatible part is runtime behavior: a table that previously could still be cleared through the older path is now rejected immediately if it has non-main refs.

1.3 Performance / Side Effects

  • For plain tables, the metadata-reset approach is a reasonable performance win.
  • The main side effect is compatibility-related: the supported table shape is narrower than before.

1.4 Error Handling and Logs

Issue 1: Tables with non-main snapshot refs are now rejected outright

  • Location: IcebergCatalog.java:285
  • Why this is a real problem:
    this code sits directly on the real KEEP_SCHEMA + DROP_DATA path. If the table has non-main refs, the PR now turns that into a hard failure instead of falling back to a behavior-compatible truncation path.
  • Risk:
    users with Iceberg tag/branch usage can hit a main-path failure after upgrading to this implementation.
  • Suggested fix:
    if non-main refs are present, fall back to the older delete-based truncation path instead of failing the operation outright. If you intentionally want to keep the new limitation, it needs to be documented explicitly as an incompatible behavior change.
  • Severity: High

2. Code Quality

  • The implementation is cleanly structured.
  • I would still add one contract-style test around tables with non-main refs, because that is exactly where the current compatibility regression sits.

3. Architecture

  • For plain tables this is a good direction.
  • The blocker is that the current head narrows the supported table shape without a compatibility-safe fallback.

4. Issue Summary

No. Issue Location Severity
1 Tables with non-main snapshot refs are now rejected outright IcebergCatalog.java:285 High

5. Merge Conclusion

Conclusion: merge after fixes

  1. Blocking items
  • Issue 1 must be fixed before merge.
  1. Suggested non-blocking follow-up
  • Add one contract test around the non-main-ref path so this boundary stays explicit.

The metadata-reset idea is worth keeping. The blocker is the new compatibility regression on tagged/branched Iceberg tables, not the overall direction.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.conditional(
IcebergSinkOptions.DATA_SAVE_MODE,
DataSaveMode.DROP_DATA,
IcebergDropDataStrategy)

Copy link
Copy Markdown
Contributor

@DanielLeens DanielLeens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. I re-reviewed the latest head locally against the current upstream/dev. This was a source-level review only; I did not run local Maven in this round.

What This PR Fixes

  • User pain: data_save_mode = DROP_DATA on Iceberg can be expensive when it always goes through a full-table delete commit.
  • Fix approach: introduce an explicit iceberg.drop-data.strategy; keep DELETE_COMMIT as the default for compatibility, and add an opt-in HARD_METADATA_RESET path plus branch / docs / tests.
  • One-line summary: the current head resolves the compatibility blocker from my previous review by restoring the legacy default behavior and making metadata reset an explicit opt-in strategy.

Runtime Chain I Checked

SaveMode path
  -> IcebergSink.getSaveModeHandler()
      -> new IcebergSaveModeHandler(..., dropDataStrategy, commitBranch)
      -> handleDataSaveMode()
          -> keepSchemaDropData()
              -> IcebergCatalog.truncateTable(tablePath, true, strategy, commitBranch)
                  -> strategy = DELETE_COMMIT
                       -> table.newDelete()
                       -> optional toBranch(commitBranch)
                       -> deleteFromRowFilter(alwaysTrue())
                       -> commit()
                  -> strategy = HARD_METADATA_RESET
                       -> TableOperations.current()
                       -> remove all refs + snapshots from metadata
                       -> operations.commit(base, resetMetadata)
                       -> recreateCommitBranchIfNeeded(non-main branch only)

Key Findings

  1. The normal data_save_mode = DROP_DATA path definitely hits this change.
  2. The default strategy is now DELETE_COMMIT, so existing jobs keep the legacy behavior.
  3. The destructive metadata-reset path is opt-in, and both EN/ZH docs now say explicitly that it clears all snapshot refs / snapshots and leaves orphan cleanup to separate Iceberg maintenance.
  4. The blocker from my previous review is resolved: the PR no longer hard-fails tagged / branched tables on the default path.

I also checked the upstream Iceberg ManageSnapshots#createBranch(String) contract from the published source jar. When the current snapshot is null, Iceberg creates a new empty snapshot for that branch instead of failing, so the non-main branch recreation logic is valid at the API-contract level.

No blocking issue found.

Conclusion: can merge

  1. Blocking items
  • None.
  1. Suggested non-blocking follow-up
  • If this area evolves further, one more integration-style check around the first write after a hard metadata reset would still be useful.

Overall, this version fixes the compatibility problem from the last round in the right way: metadata reset remains available, but it is now behind an explicit opt-in knob and the default path stays backward-compatible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants