Closed
Labels
- arrow: Changes to the arrow crate
- enhancement: Any new improvement worthy of an entry in the changelog
- parquet: Changes to the parquet crate
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently we only provide APIs for projecting the root columns of an arrow schema; there is no easy mechanism to project parts of nested columns. This is despite both the arrow-json reader and the parquet readers supporting nested projection.
This lack of support has come up a few times recently:
- Parquet: clear metadata and project fields of ParquetRecordBatchStream::schema #5135 (comment)
- Extract parquet statistics to its own module, add tests datafusion#8294
- Introduce ProjectionMask To Allow Nested Projection Pushdown datafusion#2581
Describe the solution you'd like
I would like an API that makes it easy to perform nested projection on a schema.
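To make the request concrete, here is a minimal sketch of what nested projection could mean. It uses a toy schema model, not the arrow crate's real types: `DataType`, `Field`, `project_leaves` and `example_schema` are all hypothetical names invented for this illustration. It assumes leaves are selected by depth-first index, loosely in the spirit of the `ProjectionMask` idea referenced above, and that a struct is dropped once none of its children are selected.

```rust
/// Toy stand-in for a data type: either a leaf or a struct of child fields.
/// (Hypothetical; not the arrow crate's `DataType`.)
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    Struct(Vec<Field>),
}

/// Toy stand-in for a schema field. (Hypothetical.)
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

impl Field {
    fn new(name: &str, data_type: DataType) -> Self {
        Field { name: name.to_string(), data_type }
    }
}

/// Keep only the leaf fields whose depth-first index appears in `leaves`.
/// Structs are rebuilt with just their selected children and are dropped
/// entirely if no child is selected.
fn project_leaves(fields: &[Field], leaves: &[usize]) -> Vec<Field> {
    let mut next_leaf = 0;
    project_inner(fields, leaves, &mut next_leaf)
}

fn project_inner(fields: &[Field], leaves: &[usize], next_leaf: &mut usize) -> Vec<Field> {
    let mut out = Vec::new();
    for field in fields {
        match &field.data_type {
            DataType::Struct(children) => {
                // Recurse so that leaf numbering continues through nested fields.
                let kept = project_inner(children, leaves, next_leaf);
                if !kept.is_empty() {
                    out.push(Field::new(&field.name, DataType::Struct(kept)));
                }
            }
            leaf => {
                if leaves.contains(&*next_leaf) {
                    out.push(Field::new(&field.name, leaf.clone()));
                }
                *next_leaf += 1;
            }
        }
    }
    out
}

/// Example schema with leaves numbered depth-first:
/// 0 = id, 1 = user.name, 2 = user.age
fn example_schema() -> Vec<Field> {
    vec![
        Field::new("id", DataType::Int64),
        Field::new(
            "user",
            DataType::Struct(vec![
                Field::new("name", DataType::Utf8),
                Field::new("age", DataType::Int64),
            ]),
        ),
    ]
}
```

With this sketch, `project_leaves(&example_schema(), &[0, 2])` keeps `id` and a `user` struct containing only `age`, which is the kind of result a root-column-only projection cannot express.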
Describe alternatives you've considered
Additional context