StructArray field names are not unique #9205

@connortsui20

This is basically a followup of apache/arrow#27083, which was not migrated to this repository.

I think the defaults for the C++ implementation may have changed since that issue was posted (and I have no idea where the Java implementation handles this logic).

Here are the existing docs: https://docs.rs/arrow/latest/arrow/array/struct.StructArray.html#method.column_by_name. That documentation comment should probably be updated.

In addition to this, I think there are several issues related to StructArray casting and schema evolution that have not taken this behavior into account. Struct casting might be fine because it only looks at the type of each field and not the name? But I can imagine that schema evolution becomes stranger when a struct can contain several fields with the same name but different types.
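
To make the ambiguity concrete, here is a minimal std-only sketch. It does not use arrow-rs: `Field`, `field`, `first_by_name`, and `all_by_name` are hypothetical names invented for illustration, and the first-match rule is an assumption based on my reading of the linked `column_by_name` docs, not a copy of the actual implementation.

```rust
// Hypothetical, std-only model of a struct's child fields.
// These are NOT arrow-rs types; they only illustrate name-based lookup.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

// Small constructor helper (hypothetical) to keep the examples short.
fn field(name: &str, data_type: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
    }
}

// First-match lookup (assumed behavior): returns the position and field
// of the first child whose name matches, silently ignoring later duplicates.
fn first_by_name<'a>(fields: &'a [Field], name: &str) -> Option<(usize, &'a Field)> {
    fields.iter().enumerate().find(|(_, f)| f.name == name)
}

// Returns every position carrying the given name, so a caller can
// detect that a name is ambiguous instead of silently picking one.
fn all_by_name(fields: &[Field], name: &str) -> Vec<usize> {
    fields
        .iter()
        .enumerate()
        .filter(|(_, f)| f.name == name)
        .map(|(i, _)| i)
        .collect()
}
```

With children `[a: Int32, b: Utf8, a: Float64]`, `first_by_name(&fields, "a")` yields the `Int32` child at position 0, and the `Float64` child is unreachable by name. `all_by_name(&fields, "a")` reports both positions, which is the kind of check that schema evolution logic would need before matching fields by name.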

Also see the related discussion in Vortex, which has similar behavior now simply because this is what the Rust Arrow implementation does.