etseidl (Contributor) commented Oct 29, 2024

Which issue does this PR close?

Part of #6447. Also see #6582.

Rationale for this change

The behavior of ParquetMetaDataReader when page indexes are requested differs between the synchronous and asynchronous implementations. For historical reasons, the synchronous methods currently return empty vectors for the ColumnIndex and OffsetIndex when page indexes are requested but not present in the file. The asynchronous methods instead return None in that case.
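
To illustrate why the inconsistency is awkward for callers, here is a minimal sketch using stand-in types rather than the parquet crate's own structs. `ColumnIndexStub`, `ColumnIndexes`, and `has_page_index` are hypothetical names, and the exact shape of the "empty" result from the synchronous path is an assumption, so the helper treats any all-empty result as missing.

```rust
/// Hypothetical stand-in for a per-column page index entry.
/// The real crate uses richer types; this is for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnIndexStub {
    pub null_pages: Vec<bool>,
}

/// One inner Vec per row group, each holding one entry per column.
pub type ColumnIndexes = Vec<Vec<ColumnIndexStub>>;

/// Returns true only if page index data is actually present.
///
/// Code that had to work with both reader paths needed to treat
/// `None` (asynchronous path) and an empty result (synchronous path)
/// as the same "no page index" condition.
pub fn has_page_index(column_index: Option<&ColumnIndexes>) -> bool {
    match column_index {
        // Asynchronous convention: the index is simply absent.
        None => false,
        // Synchronous convention (assumed shape): present but empty.
        Some(row_groups) => row_groups.iter().any(|rg| !rg.is_empty()),
    }
}
```

With a single convention of returning None, the second match arm would no longer need the emptiness check.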

What changes are included in this PR?

This PR changes the behavior of ParquetMetaDataReader to always return None when page indexes are requested but not present. It also changes the behavior and signatures of the legacy functions read_columns_indexes and read_offset_indexes. These now return an optional vector, which is None rather than an empty vector when page indexes are not present.
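
The effect on callers of the legacy functions can be sketched as follows. This is not the crate's implementation: `OffsetIndexStub`, `ChunkStub`, `read_offset_indexes_sketch`, and `page_count` are hypothetical names, the error handling the real functions perform is omitted, and the rule used here (None as soon as any chunk lacks an index) is an assumption made to keep the example small.

```rust
/// Hypothetical stand-in for an offset index entry.
#[derive(Debug, Clone, PartialEq)]
pub struct OffsetIndexStub {
    pub page_offsets: Vec<i64>,
}

/// Hypothetical stand-in for column chunk metadata, reduced to the
/// one thing this sketch needs: whether an offset index exists.
pub struct ChunkStub {
    pub offset_index: Option<OffsetIndexStub>,
}

/// Sketch of the new convention: an optional vector instead of a
/// possibly empty one. Collecting into `Option<Vec<_>>` yields `None`
/// as soon as any chunk has no index (an assumption of this sketch).
pub fn read_offset_indexes_sketch(chunks: &[ChunkStub]) -> Option<Vec<OffsetIndexStub>> {
    chunks.iter().map(|c| c.offset_index.clone()).collect()
}

/// Caller-side handling: the missing case is now an explicit `None`
/// rather than something inferred from an empty vector.
pub fn page_count(indexes: Option<Vec<OffsetIndexStub>>) -> usize {
    indexes.map_or(0, |v| {
        v.iter().map(|oi| oi.page_offsets.len()).sum::<usize>()
    })
}
```

Code that previously checked `is_empty()` on the returned vector would instead match on the Option, for example with `if let Some(indexes) = ...`.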

Are there any user-facing changes?

Yes, as noted above.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Oct 29, 2024
@tustvold tustvold added api-change Changes to the arrow API next-major-release the PR has API changes and is waiting on the next major version labels Oct 29, 2024
@tustvold tustvold merged commit 73a0c26 into apache:main Nov 24, 2024
16 checks passed
@etseidl etseidl deleted the missing_page_index branch November 25, 2024 23:54
alamb pushed a commit that referenced this pull request Jun 11, 2025
# Which issue does this PR close?
Related to #6447.

While reviewing other PRs I happened to notice an old FIXME I left
behind that should have been removed in #6639.

# Rationale for this change


# What changes are included in this PR?


# Are there any user-facing changes?
No, just removes a comment