[Parquet] Split `ParquetMetadataReader` into IO/decoder state machine and thrift parsing #8439

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
- Related to https://github.com/apache/arrow-rs/issues/8000
- Similar to https://github.com/apache/arrow-rs/issues/7983, but for the metadata reader

The current `ParquetMetaDataReader` intermixes three things:
1. The state machine for decoding Parquet metadata (footer, then metadata, then (optional) indexes)
2. Orchestrating IO (i.e. calling read, etc.)
3. Decoding Thrift-encoded bytes into objects

This makes it almost impossible to add features like "only decode a subset of the columns in the ColumnIndex" and other potentially advanced use cases.



**Describe the solution you'd like**

Now that we have a "push" style API for metadata decoding that avoids IO, I would like to separate out these three parts, so that features such as partial ColumnIndex decoding become practical to add.
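To make the proposed split concrete, here is a rough sketch of what the three layers could look like once separated. All names (`MetadataStateMachine`, `Step`, `drive`, `parse_columns`) are hypothetical and are not the existing arrow-rs API. The only Parquet fact relied on is the 8-byte footer (4-byte little-endian metadata length followed by `PAR1`); the "Thrift parsing" layer is replaced by a stand-in that reads comma-separated column names, and indexes are omitted.

```rust
use std::ops::Range;

/// Parquet footer: 4-byte little-endian metadata length + b"PAR1".
const FOOTER_LEN: u64 = 8;

/// What the state machine wants next (hypothetical type).
#[derive(Debug, PartialEq)]
enum Step {
    /// The caller must fetch this byte range of the file and push it.
    NeedsData(Range<u64>),
    /// The raw, still-encoded metadata bytes are available.
    Finished(Vec<u8>),
}

enum State {
    Footer,
    Metadata { range: Range<u64> },
    Done(Vec<u8>),
}

/// (1) State machine only: performs no IO and no parsing of the payload.
struct MetadataStateMachine {
    file_len: u64,
    state: State,
}

impl MetadataStateMachine {
    fn new(file_len: u64) -> Self {
        Self { file_len, state: State::Footer }
    }

    /// Report the next action without doing any IO.
    fn next_step(&self) -> Step {
        match &self.state {
            State::Footer => {
                Step::NeedsData(self.file_len.saturating_sub(FOOTER_LEN)..self.file_len)
            }
            State::Metadata { range } => Step::NeedsData(range.clone()),
            State::Done(bytes) => Step::Finished(bytes.clone()),
        }
    }

    /// Accept the bytes for the most recently requested range and advance.
    fn push(&mut self, data: &[u8]) -> Result<(), String> {
        let next = match &self.state {
            State::Footer => {
                if data.len() != FOOTER_LEN as usize || !data.ends_with(b"PAR1") {
                    return Err("invalid footer".to_string());
                }
                let len = u32::from_le_bytes([data[0], data[1], data[2], data[3]]) as u64;
                let end = self.file_len.checked_sub(FOOTER_LEN).ok_or("file too small")?;
                let start = end.checked_sub(len).ok_or("metadata length exceeds file")?;
                State::Metadata { range: start..end }
            }
            State::Metadata { range } => {
                if data.len() as u64 != range.end - range.start {
                    return Err("unexpected metadata length".to_string());
                }
                State::Done(data.to_vec())
            }
            State::Done(_) => return Err("already finished".to_string()),
        };
        self.state = next;
        Ok(())
    }
}

/// (2) IO only: satisfies range requests from an in-memory "file".
fn drive(file: &[u8]) -> Result<Vec<u8>, String> {
    let mut sm = MetadataStateMachine::new(file.len() as u64);
    loop {
        match sm.next_step() {
            Step::NeedsData(r) => {
                let bytes = file
                    .get(r.start as usize..r.end as usize)
                    .ok_or("range out of bounds")?;
                sm.push(bytes)?;
            }
            Step::Finished(raw) => return Ok(raw),
        }
    }
}

/// (3) Parsing only: a stand-in for Thrift decoding. The payload here is
/// assumed to be comma-separated column names; only `wanted` ones are kept.
fn parse_columns(raw: &[u8], wanted: &[&str]) -> Result<Vec<String>, String> {
    let text = std::str::from_utf8(raw).map_err(|e| e.to_string())?;
    Ok(text
        .split(',')
        .filter(|c| wanted.contains(c))
        .map(String::from)
        .collect())
}
```

With this shape, a different IO strategy (async, object store, prefetched buffer) only replaces `drive`, and selective decoding only touches `parse_columns`; the state machine stays the same in both cases.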


**Describe alternatives you've considered**


**Additional context**
