Skip to content

Conversation

@ayushdg
Copy link
Contributor

@ayushdg ayushdg commented Mar 30, 2023

Which issue does this PR close?

Closes #5626

Rationale for this change

Some part of the codebase use sqlToRel's options for enable_ident_normalization to decide whether to normalize identifiers or not, but it isn't enforced consistently throughout the codebase.

What changes are included in this PR?

Make planner:normalize_ident public.
All methods implementing SqlToRel use enable_ident_normalization while normalizing.

Are these changes tested?

  • Add tests

Are there any user-facing changes?

No.

@github-actions github-actions bot added the sql SQL Planner label Mar 30, 2023
@ayushdg ayushdg marked this pull request as ready for review March 30, 2023 18:51
@ayushdg
Copy link
Contributor Author

ayushdg commented Mar 30, 2023

While working on this PR, I also noticed that there are places not related to the planner which also normalize names such as column. I wonder if it's valuable to also make these method configurable to enable ident normalization?

https://github.com/apache/arrow-datafusion/blob/c09edade14d456f1d2161c5ebb8c1e51e592a8ef/datafusion/common/src/column.rs#L73

@Jefffrey
Copy link
Contributor

While working on this PR, I also noticed that there are places not related to the planner which also normalize names such as column. I wonder if it's valuable to also make these method configurable to enable ident normalization?

https://github.com/apache/arrow-datafusion/blob/c09edade14d456f1d2161c5ebb8c1e51e592a8ef/datafusion/common/src/column.rs#L73

I had some thoughts about this previously (not this specific function, but still applicable): #5183 (comment)

Generally I'm not sure how exactly to be able to toggle normalization at that level (the common crate) without API changes to dependent functions, like from_qualified_name(...) in Column as you've highlighted

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @ayushdg

I have some suggestions on how maybe to make this code better, but I also think what is here is better than what is on master so it could be merged as is.

}

fn normalize_ident(id: Ident, enable_normalization: bool) -> String {
pub fn normalize_ident(id: Ident, enable_normalization: bool) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think about making crate::utils::normalize_ident pub instead? making something public that only does something if true is passed in seems somewhat strange to me

@alamb
Copy link
Contributor

alamb commented Apr 2, 2023

@ayushdg please let me know if you would to make changes to this PR or if we should merge it as is

@ayushdg ayushdg marked this pull request as draft April 3, 2023 06:51
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cases like this with helper methods that get called from SqlToRel impl methods is there a better approach than adding a normalizer argument to each of these methods?
https://github.com/apache/arrow-datafusion/blob/c5e935a15f22d16d8ca244e68b128f7c5ff28375/datafusion/sql/src/statement.rs#L54-L65

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of any at the moment

@ayushdg ayushdg marked this pull request as ready for review April 6, 2023 22:09
@ayushdg ayushdg requested a review from alamb April 6, 2023 22:09
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @ayushdg

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of any at the moment

@ayushdg
Copy link
Contributor Author

ayushdg commented Apr 10, 2023

Thanks @alamb let me know if anything else is needed, or if the pr is good to merge as is.

@alamb alamb merged commit 570a1cf into apache:main Apr 10, 2023
korowa pushed a commit to korowa/arrow-datafusion that referenced this pull request Apr 13, 2023
…led (apache#5785)

* Make planner::normalize_ident public and refactor planner impl method to honor planner options

* Undo pre-commit config change

* Add tests

* Add identNormalizer struct to sqlToRel and move calls to normalize_ident to the struct implementation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SqltoRel doesn't honor enable_ident_normalization some cases

3 participants