[Fix][Connector-V2] Fix Paimon nested row write#10888
Conversation
f497ac5 to
ec3f4a4
Compare
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for the contribution. I went through the latest head and traced the nested-row conversion path into the Paimon sink. I do not see a source-level blocker in the current implementation; the remaining gate is still the red CI.
What this PR solves
- User pain point
NestedROWvalues can fail to write into the Paimon sink when recursive conversion loses the sink-side nested schema or nested field names. - Fix approach
This PR makes nested-row recursion use the sink-side nested schema explicitly and preserves nested field names when converting SeaTunnel row types back to Paimon row types. - One-line summary
This is a focused fix for the nested-row write path into Paimon, and I do not see a source-level blocker on the latest head.
Runtime path I checked
PaimonSinkWriter.write(...) [PaimonSinkWriter.java:230-233]
-> RowConverter.reconvert(element, seaTunnelRowType, sinkPaimonTableSchema)
-> iterate fields [RowConverter.java:456-493]
-> ROW branch
-> fetch target RowType from sink schema
-> recursively reconvert nested SeaTunnelRow using paimonRowType.getFields()
-> build InternalRow
Key findings
- The normal Paimon sink write path definitely hits this logic.
- The fix is targeted at the real nested-row boundary, not just a superficial wrapper.
- The new tests cover success, field-count mismatch, incompatible nested schema, field-name mismatch, null nested row, and nested precision handling (
RowConverterTest.java:372-515,RowTypeConverterTest.java:235-248). - I do not see a new source-level blocker on the latest head.
Test / CI notes
- The new tests are in-memory schema/row conversion tests and look structurally stable.
- The latest contributor
Buildis still red, including a failedunit-test (11, ubuntu-latest)lane and follow-on cancelled/skipped matrix jobs, so CI still needs to be resolved before merge.
Conclusion: can merge after fixes
- Blocking items
- Please get the latest
Buildgreen before merge.
- Suggested but non-blocking improvements
- No new source-side changes are required from my side on this revision.
Overall, the nested-row sink-schema propagation looks correct on the latest head, and I do not have a source-level blocker for this revision. The remaining gate is CI.
|
@QuakeWang good job, Please rebase the latest dev branch as it has resolved the issue of unit test ci failure |
ec3f4a4 to
e3bf8ff
Compare
@nzw921rx I rebased this PR onto the latest dev and reran CI. The current failure is in This looks unrelated to the Paimon changes in this PR. It seems to be exposed after the latest dev added |
Thank you, I have fixed the PR I submitted, and once it is merged into dev, it can be rebased |
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for the rebase update and for checking the current CI failure path. I re-reviewed the latest head from scratch against upstream/dev and retraced the nested-row conversion path into the Paimon sink again.
What This PR Solves
- User pain
NestedROWvalues can fail on the normal Paimon sink write path when recursive conversion validates nested data against the top-level schema instead of the sink-side nested row schema. - Fix approach
The current head makes recursive conversion use the current nested PaimonRowTypefields directly and preserves nested field names when converting the SeaTunnel row type back to Paimon row type. - One-line summary
I still do not see a source-level blocker in the latest Paimon conversion logic; the remaining red CI is outside this PR's touched path.
Runtime path I checked
Paimon sink write
-> PaimonSinkWriter.write(...)
-> RowConverter.reconvert(element, seaTunnelRowType, sinkTableSchema)
-> iterate fields
-> nested ROW branch
-> fetch nested target RowType from sink schema
-> recursively reconvert nested SeaTunnelRow using paimonRowType.getFields()
-> build InternalRow for the sink row
Findings
- The normal Paimon sink write path definitely hits this logic, so the fix is on the real runtime boundary.
- The current implementation correctly switches recursive nested-row validation away from the top-level schema.
- The added tests cover the main correctness cases here: nested success, null nested row, field-count mismatch, field-name/type mismatch, and nested decimal/timestamp precision handling.
- I do not see a new source-level blocker on the latest head.
Compatibility / Side Effects
- No new config key or protocol contract is introduced.
- The change is a focused correctness fix for the existing nested-row write path.
- I do not see a new concurrency or lifecycle risk in this patch.
Tests / CI
- The new tests are deterministic in-memory conversion tests and do not show an obvious flaky pattern.
- The latest GitHub
Buildis still red, but the visible failure isConnectorSpecificationCheckTest.testAllConnectorImplementFactoryWithUpToDateMethod, specifically becauseTypesenseSinkFactoryis missingSinkCommonOptions.MULTI_TABLE_SINK_REPLICAin itsoptionRule(). That failure is outside the touched Paimon files and matches the separate Typesense fix being discussed in#10898.
Conclusion: can merge after fixes
- Blocking items
- Please get the latest
Buildgreen before merge. Based on the current logs, that means syncing the TypesenseoptionRule()fix into the branch once the equivalent change is available ondev.
- Suggested but non-blocking follow-up
- No new source-side changes are required from my side on this revision.
From the Paimon source side, the nested-row fix looks good on the latest head. The remaining blocker is the unrelated CI failure, not the conversion logic in this PR.
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
e3bf8ff to
610e257
Compare
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for updating this. I re-reviewed the latest head and the current check state before writing this, and this is a source-level rereview of the current revision.
What this PR fixes
- User pain: nested
ROWvalues in the Paimon sink could be converted with the wrong nested schema context, which breaks the real nested-row write path. - Fix approach: the nested
ROWbranch now looks up the sink-side nestedRowTypedirectly, andRowTypeConverternow uses the inner field names instead of reusing the outer field name all the way down. - In plain terms: this makes nested row conversion follow the real sink schema instead of the outer row context.
Runtime chain
Paimon sink writer
-> RowConverter.reconvert(row, sourceType, sinkSchema)
-> field sqlType == ROW
-> lookup nested RowType from sink schema
-> recursively convert nested SeaTunnelRow with nested sink fields
-> write nested InternalRow with InternalRowSerializer(nestedRowType)
Findings
- On the current head, the normal write path really does hit this code when a sink row contains a nested
ROWfield. - The fix is precise: it corrects the nested schema source instead of adding a defensive special case.
- The new unit coverage is good. I checked the success path, field-count mismatch, field-type mismatch, field-name mismatch, null nested row, and nested timestamp/decimal precision cases.
- From the current source-level pass, I did not find a blocker in the nested-row fix itself.
Merge conclusion: can merge
- Blocking items
- No blocking source-level issue found in the current nested-row fix.
- Suggested follow-up
- The current Build is still red, but the failing job is in
seatunnel-engine-client(SeaTunnelClientTest.testGetMultiTableJobMetrics) rather than the touched Paimon path. I refreshedupstream/devbefore this pass and I do not see a newer engine-client-side fix to pick up there, so this does not look like a simple “rebase and it goes away” case.
Overall, the nested-row conversion fix itself looks good to me on the latest head. Once the branch-level CI situation is clarified, I do not see a source blocker from this diff.
Purpose of this pull request
close: #10886.
Fix Paimon sink writing for non-null nested
ROWfields. Previously, nested rows were validated against the top-level Paimon table schema, causing field count or schema mismatch errors.This PR makes
RowConverteruse the current nested PaimonRowTypeduring recursive conversion, and preserves nestedROWfield names during type conversion.Does this PR introduce any user-facing change?
Yes. Paimon sink can now write nested
ROWfields correctly.No new config option or incompatible behavior is introduced.
How was this patch tested?
Added unit tests for nested
ROWsuccess, null nestedROW, field count mismatch, field name/type mismatch, and decimal/timestamp precision.Check list
New License Guide
incompatible-changes.mdto describe the incompatibility caused by this PR.