Substrait + identifier normalization flags issues #14832

@alamb

Describe the bug

This is a report from @lmwnshn and is part of

enable_ident_normalization: I think there may be some extra complications in my situation arising from the use of Substrait (or perhaps Substrait table names are treated as quoted?)

In particular, with enable_ident_normalization=false, I cannot register lowercase table names and lowercase column names in the parquet files.

To Reproduce

Please see
https://github.com/lmwnshn/15799-s25-project1-remnants/blob/main/run_datafusion_ident.py#L37

In particular, with enable_ident_normalization=false, I cannot register lowercase table names and lowercase column names in the parquet files, as shown here:
https://github.com/lmwnshn/15799-s25-project1-remnants/blob/main/run_datafusion_ident.py#L12-L19

So I had to hack the parquet files up a bit
https://github.com/lmwnshn/15799-s25-project1-remnants/blob/main/fix_parquet.py#L11-L23

and switch to uppercased column names + register tables as uppercase
https://github.com/lmwnshn/15799-s25-project1-remnants/blob/main/run_datafusion.py#L12-L19
to get the Substrait plan to execute successfully.
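
To make the suspected mismatch concrete, here is a minimal sketch in plain Python. It is a toy model, not DataFusion's or Substrait's actual code: `normalize_ident`, `ToyCatalog`, and the `orders` table are hypothetical names invented for illustration. It assumes that unquoted identifiers are lowercased only when normalization is enabled, that table lookup is an exact, case-sensitive match, and that the Substrait plan refers to tables by uppercase names (the report does not state this explicitly, but the uppercase workaround suggests it).

```python
from typing import Dict, Optional


def normalize_ident(name: str, quoted: bool, enable_ident_normalization: bool) -> str:
    """Toy rule: lowercase an identifier only if it is unquoted and normalization is on."""
    if quoted or not enable_ident_normalization:
        return name
    return name.lower()


class ToyCatalog:
    """Hypothetical case-sensitive table registry (not DataFusion's catalog)."""

    def __init__(self) -> None:
        self._tables: Dict[str, str] = {}

    def register(self, name: str, path: str) -> None:
        self._tables[name] = path

    def resolve(self, name: str) -> Optional[str]:
        # Exact match only: "ORDERS" and "orders" are different keys.
        return self._tables.get(name)


catalog = ToyCatalog()
catalog.register("orders", "orders.parquet")  # lowercase registration

# Assumption: the plan carries the table name in uppercase.
plan_table_ref = "ORDERS"

# With normalization on, an unquoted reference is lowercased and matches.
with_norm = catalog.resolve(normalize_ident(plan_table_ref, False, True))

# With normalization off (or if the name is treated as quoted), it is used as-is.
without_norm = catalog.resolve(normalize_ident(plan_table_ref, False, False))
treated_as_quoted = catalog.resolve(normalize_ident(plan_table_ref, True, True))

# The workaround from the report: register the table under the uppercase name.
catalog.register("ORDERS", "orders.parquet")
after_workaround = catalog.resolve(normalize_ident(plan_table_ref, False, False))
```

In this toy model, the lookup only succeeds without normalization once the table is registered under exactly the name the plan uses, which mirrors the uppercase workaround above. Whether DataFusion's Substrait consumer actually behaves this way is the open question of this issue.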

Expected behavior

I expect that I can register lowercase table names and lowercase column names in the parquet files.

Additional context

No response

Labels: bug