Remove dict_id from arrow_schema::field::Field and make dictionary IDs an internal implementation detail of flight encoding/decoding #5981

Open
@thinkharderdev

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently, the dict_id field is used only for Arrow Flight encoding/decoding, so that dictionaries can be mapped to their associated fields.

This is annoying and error-prone, as the user is left with the responsibility of assigning these dictionary IDs and ensuring that they are unique.

#5971 adds the option to auto-assign dictionary IDs during Arrow Flight encoding. This can be enabled by setting the preserve_dict_id option in IpcWriteOptions to false (the current default is true).
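To make the difference between the two modes concrete, here is a minimal, self-contained sketch. It does not use the real arrow-rs API: `SketchField`, `SketchTracker`, and `assign_all` are hypothetical names invented for illustration, and only the `preserve_dict_id` flag name is borrowed from the option described above.

```rust
/// Hypothetical stand-in for a schema field; NOT the real `arrow_schema::Field`.
#[derive(Debug, Clone)]
pub struct SketchField {
    pub name: String,
    /// User-supplied dictionary ID, playing the role of today's `dict_id`.
    pub dict_id: i64,
}

/// Hypothetical tracker that decides which dictionary ID a field gets on the wire.
pub struct SketchTracker {
    preserve_dict_id: bool,
    next_id: i64,
}

impl SketchTracker {
    pub fn new(preserve_dict_id: bool) -> Self {
        Self { preserve_dict_id, next_id: 0 }
    }

    /// Return the ID to use for `field`.
    /// With `preserve_dict_id == true` the user's value is passed through unchanged;
    /// otherwise a fresh sequential ID is handed out and the user's value is ignored.
    pub fn assign(&mut self, field: &SketchField) -> i64 {
        if self.preserve_dict_id {
            field.dict_id
        } else {
            let id = self.next_id;
            self.next_id += 1;
            id
        }
    }
}

/// Assign IDs to every field in order, using a fresh tracker.
pub fn assign_all(fields: &[SketchField], preserve_dict_id: bool) -> Vec<i64> {
    let mut tracker = SketchTracker::new(preserve_dict_id);
    fields.iter().map(|f| tracker.assign(f)).collect()
}
```

In this sketch, two fields that both carry the same user-supplied ID keep that colliding ID when `preserve_dict_id` is true, while auto-assignment gives each field a distinct ID regardless of what the user set.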

Describe the solution you'd like

This can be done in stages, but ultimately I would like to:

  1. Make preserve_dict_id default to false
  2. Remove the preserve_dict_id option altogether
  3. Remove the dict_id field from arrow_schema::field::Field entirely, as it will no longer have any purpose
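One way the end state of step 3 could look is sketched below, under the assumption that the encoder and decoder both derive dictionary IDs by walking the schema in the same order. This is an illustration only, not the actual arrow-rs design: `PlainField`, `visit`, and `derive_dict_ids` are hypothetical names, and the depth-first ordering is an assumption.

```rust
/// Hypothetical field type with no `dict_id`; NOT the real `arrow_schema::Field`.
#[derive(Debug, Clone)]
pub struct PlainField {
    pub name: String,
    pub is_dictionary: bool,
    pub children: Vec<PlainField>,
}

impl PlainField {
    pub fn new(name: &str, is_dictionary: bool, children: Vec<PlainField>) -> Self {
        Self { name: name.to_string(), is_dictionary, children }
    }
}

/// Visit `field` and then its children (depth-first, pre-order), giving each
/// dictionary-encoded field the next sequential ID.
fn visit(field: &PlainField, next_id: &mut i64, out: &mut Vec<(String, i64)>) {
    if field.is_dictionary {
        out.push((field.name.clone(), *next_id));
        *next_id += 1;
    }
    for child in &field.children {
        visit(child, next_id, out);
    }
}

/// Derive the (field name, dictionary ID) mapping purely from the schema.
/// Because the result depends only on the schema, an encoder and a decoder
/// that both call this function arrive at the same mapping.
pub fn derive_dict_ids(schema: &[PlainField]) -> Vec<(String, i64)> {
    let mut next_id = 0i64;
    let mut out = Vec::new();
    for field in schema {
        visit(field, &mut next_id, &mut out);
    }
    out
}
```

The point of the sketch is that the IDs never appear on the field type, so users cannot assign them incorrectly; they exist only inside the encoding and decoding paths.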

Describe alternatives you've considered

We could leave this as a configurable option and either do only step 1 above, or leave auto-assignment of dictionary IDs as an opt-in feature.

Additional context

Labels: enhancement (any new improvement worthy of an entry in the changelog)