Improve schema merging #4223

@andygrove

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I am trying to work with the NYC taxi Parquet dataset, which has one file per month. Over time, some of the column types have changed. For example, passenger_count started out as Int64 and was later changed to Float64.

arrow-rs cannot merge these schemas.

Other solutions (such as DuckDB) will merge these schemas and pick the least restrictive type (Float64).
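
To make the requested behavior concrete, here is a minimal, std-only sketch of "pick the least restrictive type" merging. It does not use arrow-rs: `ColumnType`, `widen`, and `merge_schemas` are hypothetical names invented for illustration, and the widening table is an assumption about what a reasonable rule set might look like, not what DuckDB or arrow-rs actually implement.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a column type (hypothetical; not arrow-rs's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnType {
    Int32,
    Int64,
    Float64,
    Utf8,
}

/// Return the least restrictive type that can hold values of both inputs,
/// or `None` when this sketch defines no common type.
/// Assumption: integers widen to Float64, as in the passenger_count example,
/// even though Int64 -> Float64 can lose precision for very large values.
pub fn widen(a: ColumnType, b: ColumnType) -> Option<ColumnType> {
    use ColumnType::*;
    match (a, b) {
        (x, y) if x == y => Some(x),
        (Int32, Int64) | (Int64, Int32) => Some(Int64),
        (Int32, Float64) | (Float64, Int32) => Some(Float64),
        (Int64, Float64) | (Float64, Int64) => Some(Float64),
        _ => None,
    }
}

/// Merge per-file schemas (lists of column name and type) into one schema,
/// widening the type of any column that differs between files.
pub fn merge_schemas(
    schemas: &[Vec<(&str, ColumnType)>],
) -> Result<BTreeMap<String, ColumnType>, String> {
    let mut merged: BTreeMap<String, ColumnType> = BTreeMap::new();
    for schema in schemas {
        for &(name, ty) in schema {
            let resolved = match merged.get(name) {
                // Column seen before: widen, or fail if the types are incompatible.
                Some(&existing) => widen(existing, ty).ok_or_else(|| {
                    format!("cannot merge column '{name}': {existing:?} vs {ty:?}")
                })?,
                // First time this column appears: take its type as-is.
                None => ty,
            };
            merged.insert(name.to_string(), resolved);
        }
    }
    Ok(merged)
}
```

With one schema where passenger_count is `Int64` and another where it is `Float64`, `merge_schemas` resolves the column to `Float64`. Pairs with no entry in the widening table (for example `Utf8` and `Int64`) produce an error instead of a silent coercion.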

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Labels: enhancement
