[Improve][connector-jdbc] Remove useless parameters: support_upsert_by_insert_only#10897
[Improve][connector-jdbc] Remove useless parameters: support_upsert_by_insert_only#10897chl-wxp wants to merge 1 commit into
Conversation
|
@chl-wxp Hello, may I ask why this parameter is being cleared? It seems to be an entry that only wants to |
Thank you for your review. When I reviewed the code, I found that the final behavior controlled by the parameter |
@chl-wxp Thank you for your clarification. I still have the following concerns
private static JdbcBatchStatementExecutor<SeaTunnelRow> createUpsertExecutor(
JdbcDialect dialect,
String database,
String table,
TableSchema tableSchema,
TableSchema databaseTableSchema,
String[] pkNames,
TableSchema pkTableSchema,
Function<SeaTunnelRow, SeaTunnelRow> keyExtractor,
boolean enableUpsert,
boolean isPrimaryKeyUpdated,
boolean supportUpsertByInsertOnly) {
if (supportUpsertByInsertOnly) {
return createInsertOnlyExecutor(
dialect, database, table, tableSchema, databaseTableSchema);
}
if (enableUpsert) {
Optional<String> upsertSQL =
dialect.getUpsertStatementByTableSchema(database, table, tableSchema, pkNames);
if (upsertSQL.isPresent()) {
return createSimpleExecutor(
upsertSQL.get(),
tableSchema,
databaseTableSchema,
dialect.getRowConverter());
}
return createInsertOrUpdateByQueryExecutor(
dialect,
database,
table,
tableSchema,
databaseTableSchema,
pkNames,
pkTableSchema,
keyExtractor,
isPrimaryKeyUpdated);
}
return createInsertOrUpdateExecutor(
dialect,
database,
table,
tableSchema,
databaseTableSchema,
pkNames,
isPrimaryKeyUpdated);
} |
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for working on this cleanup. I reviewed the full current diff against upstream/dev and traced the JDBC sink execution path instead of only reading the option deletion.
What This PR Changes
- User pain
The currentsupport_upsert_by_insert_onlyoption is misleading: it does not implement a safe "insert-only upsert" contract. It switches the executor to plainINSERT, so duplicate primary keys still fail at runtime. - Change approach
This PR removes the option fromJdbcSinkOptions,JdbcSinkConfig,JdbcSinkFactory, andJdbcOutputFormatBuilder, and also removes the Redshift docs reference. - One-line summary
The cleanup direction is understandable, but removing an existing sink config key is still a compatibility change and the current PR does not preserve that contract safely.
Runtime path I checked
JDBC sink config
-> JdbcSinkFactory.optionRule()
-> currently exposes support_upsert_by_insert_only
-> JdbcSinkConfig.of(...)
-> reads support_upsert_by_insert_only into JdbcSinkConfig
Write path
-> JdbcOutputFormatBuilder.createBatchStatementExecutor(...)
-> if supportUpsertByInsertOnly == true
-> createInsertOnlyExecutor(...)
-> always use plain INSERT
-> otherwise
-> normal upsert / insert-update / delete executors
Findings
Issue 1: this PR removes a user-facing JDBC sink option without a compatibility bridge or migration contract
- Location:
seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/config/JdbcSinkOptions.java:109-114,.../JdbcSinkConfig.java:41-55,.../JdbcSinkFactory.java:224-231,.../JdbcOutputFormatBuilder.java:175-248 - Why this is a blocker:
Even if the option is semantically weak, it is still part of the current public config surface. Existing jobs that setsupport_upsert_by_insert_onlywill stop being valid once this PR lands, because the key is removed from option exposure and from the runtime path entirely. - Risk:
This is a backward-compatibility break for existing JDBC sink configs. It is especially risky because the PR does not adddocs/en/introduction/concepts/incompatible-changes.mdmigration guidance, and it does not keep a compatibility parse path such as "accept but warn / ignore". - Best fix:
- Option A, preferred: keep the option parsable for one transition period, mark it deprecated, and either warn loudly or treat it as a no-op with explicit migration guidance.
- Option B: if the community wants immediate removal, then this must be handled as an explicit incompatible change with migration documentation telling users what config to use instead.
- Severity: High
Compatibility / Side Effects
- This PR is not fully compatible in its current form, because it removes an existing user-facing config key.
- No API class signature is broken, but the sink configuration contract changes.
- I do not see a new CPU or concurrency risk from the code deletion itself; the blocker is the config-compatibility boundary.
Tests / CI
- I did not do local build execution in this review batch; this is a source-level review only.
- The latest GitHub
Buildis also red, but the visible failure isCoordinatorServiceTest.testGetPendingJobInfo:956 expected: <true> but was: <false>inseatunnel-engine-server, which is outside the touched JDBC files. From current evidence that CI failure does not look caused by this JDBC cleanup.
Conclusion: can merge after fixes
- Blocking items
- Issue 1: preserve the existing config contract with a deprecation bridge, or document the removal as an explicit incompatible change with migration guidance.
- Suggested but non-blocking follow-up
- After the compatibility handling is settled, sync with the latest
devagain if the unrelated engine CI failure is still present.
The main concern here is not that the current option is perfect. The concern is that SeaTunnel should not silently remove an existing sink config surface without a transition contract.
Purpose of this pull request
Remove the useless parameter support_upsert_by_insert_only from JDBC sink connector.
Reasons for removal:
primary key conflicts when the record already exists.
- When primary_keys is not configured: The system uses simple INSERT directly; this parameter is not needed.
- When primary_keys is configured: The upsert mode handles CDC operations (INSERT/UPDATE/DELETE). Setting support_upsert_by_insert_only=true causes UPDATE_AFTER operations to execute INSERT, leading to primary key violation errors.
The correct way to achieve "insert-only without update" behavior is to disable upsert mode, not to use this parameter.