Skip to content

Support creating Delta tables with column mapping mode #17565

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 5, 2023

Conversation

ebyhr
Copy link
Member

@ebyhr ebyhr commented May 19, 2023

Description

Support creating Delta tables with column mapping mode
Fixes #12638

Release notes

(x) Release notes are required, with the following suggested text:

# Delta Lake
* Add support for creating tables with column mapping mode. ({issue}`12638`)

@cla-bot cla-bot bot added the cla-signed label May 19, 2023
@github-actions github-actions bot added delta-lake Delta Lake connector docs tests:hive labels May 19, 2023
@ebyhr ebyhr mentioned this pull request May 19, 2023
10 tasks
@ebyhr ebyhr self-assigned this May 19, 2023
@ebyhr ebyhr force-pushed the ebi/delta-create-table-column-mapping-mode branch 3 times, most recently from 23d28c4 to bacd0ba Compare May 22, 2023 05:00
@ebyhr ebyhr marked this pull request as ready for review May 22, 2023 07:36
@ebyhr ebyhr force-pushed the ebi/delta-create-table-column-mapping-mode branch 2 times, most recently from 9de4af1 to a36356e Compare June 6, 2023 04:47
@findinpath findinpath self-requested a review June 7, 2023 09:13
Copy link
Contributor

@findinpath findinpath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % UX failure messages

@ebyhr ebyhr force-pushed the ebi/delta-create-table-column-mapping-mode branch from a36356e to c55b7a4 Compare June 8, 2023 23:14
@ebyhr ebyhr requested a review from findepi June 9, 2023 03:40
Comment on lines +971 to +972
Object serializedType = serializeColumnType(columnMappingMode, fieldId, column.getType());
Type physicalType = deserializeType(typeManager, serializedType, usePhysicalName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks weird to serializeColumnType only to deserializeType in the next line.
can this be simplified?

also, why serializeColumnType doesn't need usePhysicalName?
the input, column.getType(), doesn't have physical and logical names, so how are .../metadata/delta.columnMapping.physicalName and .../metadata/name populated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be simplified?

That's possible, but not so simple. Let me handle in follow-up.

also, why serializeColumnType doesn't need usePhysicalName?

The method uses ColumnMappingMode argument instead of the flag. I passed the mode because it affects not only delta.columnMapping.physicalName but also delta.columnMapping.id.

how are .../metadata/delta.columnMapping.physicalName and .../metadata/name populated?

  • delta.columnMapping.physicalName is populated in serializeStructType → generateColumnMetadata in case of id and name mode
  • name is always populated in serializeStructType → serializeStructField

@ebyhr
Copy link
Member Author

ebyhr commented Jul 5, 2023

Ran CI with the internal environment.

@ebyhr ebyhr merged commit c739817 into master Jul 5, 2023
@ebyhr ebyhr deleted the ebi/delta-create-table-column-mapping-mode branch July 5, 2023 11:53
@github-actions github-actions bot added this to the 421 milestone Jul 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

Delta Lake: Writer version 5 support
3 participants