
Reset partial record state after skipping all requested records#9374

Open
jonded94 wants to merge 2 commits into apache:main from jonded94:reset-partial-record-state-after-skipping-all-records


jonded94 (Contributor) commented Feb 7, 2026

Which issue does this PR close?

Rationale for this change

The bug occurs when reading nested types (such as List) with a RowSelection, if all of the following hold:

  1. A column has multiple pages in a row group
  2. The selected rows span across page boundaries
  3. The first page is entirely consumed during skip operations

The problem is in the skip_records function (arrow-rs/parquet/src/column/reader.rs:287-382).

Root cause: When skip_records completed successfully after crossing page boundaries, the has_partial state in the RepetitionLevelDecoder could incorrectly remain true.

This happened when:

  • The skip operation exhausted a page where has_record_delimiter was false
  • The skip found the remaining records on the next page by counting a delimiter at index 0

As a result, when a subsequent read_records(1) was called, the stale has_partial=true state caused count_records to interpret the first repetition level (0) at index 0 as the end of a "phantom" partial record. It returned (1 record, 0 levels, 0 values) instead of reading the actual record data.
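
The effect of the stale flag can be reproduced with a toy model. The following is a minimal sketch, not the code in the parquet crate: this `count_records` is a hypothetical free function that keeps only one rule (a repetition level of 0 starts a new record and therefore ends the previous one) and takes `has_partial` as a plain argument.

```rust
/// Toy record counter over repetition levels (hypothetical, simplified).
/// Returns `(records, levels_consumed)`.
fn count_records(levels: &[i16], max_records: usize, has_partial: bool) -> (usize, usize) {
    let mut records = 0;
    for (idx, &level) in levels.iter().enumerate() {
        // rep == 0 starts a new record. It also ends the previous record,
        // if there is one: earlier levels in this slice, or a carried-over partial.
        if level == 0 && (idx != 0 || has_partial) {
            records += 1;
            if records == max_records {
                return (records, idx);
            }
        }
    }
    (records, levels.len())
}

fn main() {
    let page = [0i16, 1]; // one list record with two elements

    // Clean state: the record stays open until the caller sees the end of the data.
    println!("{:?}", count_records(&page, 1, false)); // (0, 2)

    // Stale state: the leading 0 "completes" a record that does not exist.
    println!("{:?}", count_records(&page, 1, true)); // (1, 0)
}
```

With a clean flag the two levels are consumed and the record is left open for the caller to close. With a stale flag the function reports one record after consuming zero levels, which matches the (1 record, 0 levels, 0 values) result described above.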

For a more detailed explanation, see #9370 (comment).

What changes are included in this PR?

Added code at the end of skip_records to reset the partial record state when all requested records have been successfully skipped.

This ensures that after skip_records completes, we're at a clean record boundary with no lingering partial record state, fixing the array length mismatch in StructArrayReader.

Are these changes tested?

In b52e043 I added a test and verified that it fails when the fix is removed.

  Bug Mechanism

  The bug requires three ingredients:

  1. Page 1 (DataPage v1): Contains a nested column (with rep levels). During skip_records, all levels on this page are consumed. count_records sees no following rep=0 delimiter, so it sets has_partial=true. Since has_record_delimiter is false (the default InMemoryPageReader returns false when more pages exist), flush_partial is not called.
  2. Page 2 (DataPage v2): Has num_rows available in its metadata. When num_rows <= remaining_records, the entire page is skipped via skip_next_page(). This does not touch the rep level decoder at all, so has_partial keeps its stale value of true from page 1.
  3. Page 3 (DataPage v1): When read_records loads this page, the stale has_partial=true causes the rep=0 at position 0 to be misinterpreted as completing a "phantom" partial record. This produces (1 record, 0 levels, 0 values) instead of reading the actual record data.
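
The three-page sequence can be simulated end to end. This is a sketch under stated assumptions, not the parquet crate's reader: `ToyPage`, `ToyReader`, `run`, and the `reset_after_skip` switch are all hypothetical, each level carries exactly one value, and the toy does not try to model how the real reader accounts for the record left open on page 1. It only shows that a metadata-based page skip leaves `has_partial` untouched, and that resetting the flag once the skip is complete removes the phantom record.

```rust
/// Hypothetical page: one value per repetition level, and a row count that is
/// `Some` only when the page header carries it (as DataPage v2 does).
struct ToyPage {
    rep_levels: Vec<i16>,
    values: Vec<i32>,
    num_rows: Option<usize>,
}

/// Hypothetical, simplified column reader. Not the parquet crate's API.
struct ToyReader {
    pages: Vec<ToyPage>,
    page: usize,            // index of the current page
    offset: usize,          // levels already consumed in the current page
    has_partial: bool,      // a record has started but its end is unknown
    reset_after_skip: bool, // toggles the toy version of the fix
}

impl ToyReader {
    /// Scans the rest of the current page; returns (records, levels consumed).
    fn count(&mut self, max_records: usize) -> (usize, usize) {
        let levels = &self.pages[self.page].rep_levels[self.offset..];
        let mut records = 0;
        for (idx, &level) in levels.iter().enumerate() {
            // rep == 0 starts a new record and so completes a pending one.
            if level == 0 && self.has_partial {
                records += 1;
                self.has_partial = false;
                if records == max_records {
                    return (records, idx);
                }
            }
            self.has_partial = true;
        }
        (records, levels.len())
    }

    fn advance(&mut self, levels: usize) {
        self.offset += levels;
        if self.offset == self.pages[self.page].rep_levels.len() {
            self.page += 1;
            self.offset = 0;
        }
    }

    fn skip_records(&mut self, n: usize) -> usize {
        let mut remaining = n;
        while remaining > 0 && self.page < self.pages.len() {
            let rows = self.pages[self.page].num_rows;
            match rows {
                // Whole-page skip from header metadata: has_partial is untouched.
                Some(rows) if self.offset == 0 && rows <= remaining => {
                    remaining -= rows;
                    self.page += 1;
                }
                _ => {
                    let (records, levels) = self.count(remaining);
                    remaining -= records;
                    self.advance(levels);
                }
            }
        }
        if self.reset_after_skip && remaining == 0 {
            self.has_partial = false; // toy equivalent of the reset in this PR
        }
        n - remaining
    }

    fn read_records(&mut self, n: usize) -> (usize, Vec<i32>) {
        let (mut records, mut values) = (0, Vec::new());
        while records < n && self.page < self.pages.len() {
            let (found, levels) = self.count(n - records);
            let page = &self.pages[self.page];
            values.extend_from_slice(&page.values[self.offset..self.offset + levels]);
            records += found;
            self.advance(levels);
        }
        // End of data: a record that is still pending is complete.
        if records < n && self.page == self.pages.len() && self.has_partial {
            self.has_partial = false;
            records += 1;
        }
        (records, values)
    }
}

/// Skips one record, then reads one, over a v1 / v2 / v1 page sequence.
fn run(reset_after_skip: bool) -> (usize, Vec<i32>) {
    let page = |values: [i32; 2], num_rows| ToyPage { rep_levels: vec![0, 1], values: values.to_vec(), num_rows };
    let pages = vec![page([10, 20], None), page([30, 40], Some(1)), page([70, 80], None)];
    let mut reader = ToyReader { pages, page: 0, offset: 0, has_partial: false, reset_after_skip };
    reader.skip_records(1);
    reader.read_records(1)
}

fn main() {
    println!("without reset: {:?}", run(false)); // (1, [])
    println!("with reset:    {:?}", run(true)); // (1, [70, 80])
}
```

Without the reset, the read reports one record but returns no values. With it, the read returns the record stored on the third page.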

  Test Verification

  - With fix (flush_partial at end of skip_records): read_records(1) correctly returns (1, 2, 2) with values [70, 80]
  - Without fix: read_records(1) returns (1, 0, 0), a phantom record with no data. This is what causes the "Not all children array length are the same!" error when different sibling columns in a struct produce different record counts


Labels

parquet Changes to the parquet crate


Development

Successfully merging this pull request may close these issues.

Error "Not all children array length are the same!" when decoding rows spanning across page boundaries in parquet file when using RowSelection

1 participant