etseidl
Contributor

@etseidl etseidl commented Sep 23, 2025

Which issue does this PR close?

Note: this targets a feature branch, not main

Rationale for this change

Continues the remodel by implementing writing of the page index structures.

What changes are included in this PR?

This PR removes the old parquet::file::page_index::Index enum and replaces it with the new ColumnIndexMetaData struct.

Are these changes tested?

Covered by existing tests

Are there any user-facing changes?

Yes. The old parquet::file::page_index::Index enum is removed, which is a breaking API change.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Sep 23, 2025
@etseidl etseidl added the api-change Changes to the arrow API label Sep 23, 2025
Some(column_index.to_thrift())
}
})
.map(|column_index| Some(column_index.clone()))
Contributor Author

This clone is unfortunate. We define the page and offset indexes as Vec<Vec<index>>, but they start out as Vec<Vec<Option<index>>> and ultimately return to that form. It would be nice to keep that consistent. Once the remodel is finished we can revisit this.
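
To illustrate the shape mismatch described above, here is a minimal sketch rather than the PR's actual code. `ColumnIndexStub` is a hypothetical stand-in for the real index type, and both helper functions are invented for illustration. It assumes the outer `Vec` is per row group and the inner `Vec` is per column. With only a borrowed `Vec<Vec<T>>`, producing `Vec<Vec<Option<T>>>` requires cloning every entry; if the caller can give up ownership, the entries can be moved instead.

```rust
/// Hypothetical stand-in for a per-column index entry.
/// This is NOT the parquet crate's type; it only exists to show the shapes.
#[derive(Debug, Clone, PartialEq)]
struct ColumnIndexStub {
    null_pages: Vec<bool>,
}

/// Borrowing path: the input is only borrowed, so each entry has to be
/// cloned to build owned `Option` values (the situation in the comment).
fn wrap_by_clone(indexes: &[Vec<ColumnIndexStub>]) -> Vec<Vec<Option<ColumnIndexStub>>> {
    indexes
        .iter()
        .map(|row_group| row_group.iter().map(|ci| Some(ci.clone())).collect())
        .collect()
}

/// Owning path: the input is consumed, so entries are moved into `Some`
/// without any clone. Only possible if the caller no longer needs the input.
fn wrap_by_move(indexes: Vec<Vec<ColumnIndexStub>>) -> Vec<Vec<Option<ColumnIndexStub>>> {
    indexes
        .into_iter()
        .map(|row_group| row_group.into_iter().map(Some).collect())
        .collect()
}
```

Both functions produce the same nested structure; the difference is only whether the entries are copied or moved. Keeping one representation throughout, as the comment suggests, would avoid the conversion entirely.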

@mbrobbel mbrobbel added this to the 57.0.0 milestone Sep 25, 2025
@etseidl
Contributor Author

etseidl commented Sep 25, 2025

Thanks @mbrobbel!

@alamb I'm going to forge ahead. Only two more major PRs to go until the breaking changes stop. 🤞

@etseidl etseidl merged commit b0cc254 into apache:gh5854_thrift_remodel Sep 25, 2025
16 checks passed
@alamb
Contributor

alamb commented Sep 25, 2025

gogogogogogo

Sorry I can't review these PRs fast enough. I am sure they are great.

3 participants