Add builder style API for manipulating ParquetMetaData #6465

@alamb

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

As part of #6002, @adriangb, @etseidl, and I are working to improve the APIs for working with ParquetMetaData.

The main use case is to store this metadata "out of band" (that is, somewhere that is not interleaved with the Parquet data itself). Storing such metadata often involves modifying an existing ParquetMetaData before storing it.

For example, one might want to remove the page index structures to save space.

At the moment it is awkward to modify ParquetMetaData: you have to re-create it from its constituent fields, and there is no way to avoid cloning.

Describe the solution you'd like
I would like some API to modify a ParquetMetaData

Describe alternatives you've considered
I propose a ParquetMetaDataBuilder that follows the model of RowGroupMetaDataBuilder

I added a simple one while writing a test in #6463, and I plan to propose it as a real API.
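
To make the idea concrete, here is a minimal sketch of the builder pattern being proposed. It deliberately uses hypothetical stand-in types (`MetaData`, `RowGroup`, `PageIndex`, `MetaDataBuilder`) and hypothetical method names (`into_builder`, `set_page_index`, `add_row_group`) rather than the parquet crate's real types, so it illustrates the shape of the API only and is not the implementation from #6463. The point it shows is that consuming the metadata into a builder lets fields be changed without cloning.

```rust
// Hypothetical stand-in types for illustration only;
// these are NOT the parquet crate's real definitions.
#[derive(Debug, Clone, PartialEq)]
struct RowGroup {
    num_rows: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct PageIndex {
    entries: Vec<i64>,
}

#[derive(Debug)]
struct MetaData {
    row_groups: Vec<RowGroup>,
    page_index: Option<PageIndex>,
}

impl MetaData {
    // Consume the metadata and hand its fields to a builder (no clone).
    fn into_builder(self) -> MetaDataBuilder {
        MetaDataBuilder(self)
    }
}

// Hypothetical builder wrapping the metadata it will eventually return.
struct MetaDataBuilder(MetaData);

impl MetaDataBuilder {
    // Replace (or clear, with `None`) the page index.
    fn set_page_index(mut self, page_index: Option<PageIndex>) -> Self {
        self.0.page_index = page_index;
        self
    }

    // Append one more row group.
    fn add_row_group(mut self, row_group: RowGroup) -> Self {
        self.0.row_groups.push(row_group);
        self
    }

    // Finish and return the modified metadata.
    fn build(self) -> MetaData {
        self.0
    }
}

// Example from the issue: drop the page index to save space.
fn strip_page_index(metadata: MetaData) -> MetaData {
    metadata.into_builder().set_page_index(None).build()
}
```

Because every builder method takes `self` by value, the row groups are moved through the builder rather than copied, which is the property the current re-create-from-fields approach lacks.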

Additional context
