Add builder style API for manipulating ParquetMetaData #6465

Closed
alamb opened this issue Sep 26, 2024 · 0 comments · Fixed by #6466
Labels
enhancement Any new improvement worthy of a entry in the changelog parquet Changes to the parquet crate

alamb commented Sep 26, 2024

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

As part of #6002, @adriangb, @etseidl, and I are working to improve the APIs for working with ParquetMetaData.

The main use case is to store this metadata "out of band" (that is, somewhere that is not interleaved with the Parquet data itself). Storing such metadata often involves modifying an existing ParquetMetaData before restoring.

For example, one might want to remove the page index structures to save space.

At the moment it is awkward to modify ParquetMetaData: you have to re-create it from its constituent fields, and there is no way to avoid cloning.

Describe the solution you'd like
I would like an API to modify a ParquetMetaData.

Describe alternatives you've considered
I propose a ParquetMetaDataBuilder that follows the model of RowGroupMetaDataBuilder

I added a simple one while trying to write a test in #6463, which I plan to propose as a real API.
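
To make the proposal concrete, here is a rough sketch of what a consuming builder could look like. Every type below is a simplified stand-in written for illustration, not one of the parquet crate's types, and the method names (into_builder, add_row_group, set_column_index, set_offset_index, build) are assumptions rather than the API from #6463. The point is only that moving the fields into a builder lets a caller drop the page index structures without cloning anything.

```rust
// Simplified stand-ins for illustration only; NOT the parquet crate's types.
#[derive(Debug, PartialEq)]
struct FileMetaData {
    num_rows: i64,
}

#[derive(Debug, PartialEq)]
struct RowGroupMetaData {
    num_rows: i64,
}

// Hypothetical placeholder for page index data (one byte buffer per row group).
type PageIndex = Vec<Vec<u8>>;

#[derive(Debug, PartialEq)]
struct ParquetMetaData {
    file_metadata: FileMetaData,
    row_groups: Vec<RowGroupMetaData>,
    column_index: Option<PageIndex>,
    offset_index: Option<PageIndex>,
}

impl ParquetMetaData {
    /// Consume the metadata and move its fields into a builder (no clones).
    fn into_builder(self) -> ParquetMetaDataBuilder {
        ParquetMetaDataBuilder(self)
    }
}

/// Hypothetical builder; it wraps the value being built and edits it in place.
struct ParquetMetaDataBuilder(ParquetMetaData);

impl ParquetMetaDataBuilder {
    /// Start from file-level metadata with no row groups and no page indexes.
    fn new(file_metadata: FileMetaData) -> Self {
        Self(ParquetMetaData {
            file_metadata,
            row_groups: Vec::new(),
            column_index: None,
            offset_index: None,
        })
    }

    fn add_row_group(mut self, row_group: RowGroupMetaData) -> Self {
        self.0.row_groups.push(row_group);
        self
    }

    fn set_column_index(mut self, column_index: Option<PageIndex>) -> Self {
        self.0.column_index = column_index;
        self
    }

    fn set_offset_index(mut self, offset_index: Option<PageIndex>) -> Self {
        self.0.offset_index = offset_index;
        self
    }

    fn build(self) -> ParquetMetaData {
        self.0
    }
}

/// Drop the page index structures to save space, keeping everything else.
fn strip_page_indexes(metadata: ParquetMetaData) -> ParquetMetaData {
    metadata
        .into_builder()
        .set_column_index(None)
        .set_offset_index(None)
        .build()
}
```

Because into_builder and build both take self by value, the row groups and file metadata are moved, never copied, which is the property the current "re-create it from its fields" approach lacks.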

Additional context

@alamb alamb added parquet Changes to the parquet crate enhancement Any new improvement worthy of a entry in the changelog labels Sep 26, 2024
@alamb alamb changed the title builder style API for manipulating ParquetMetaData Add builder style API for manipulating ParquetMetaData Oct 2, 2024