Commit d379b98

[Parquet] Minor: Update comments in page decompressor (#8764)
# Which issue does this PR close?

- follow on to #8756

# Rationale for this change

@etseidl comments: #8756 (comment)

> Not relevant to this PR, but I think this TODO has largely been addressed by #8376 which enabled skipping the decoding of the page statistics.

While I was in here, I also wanted to capture the learning based on @mapleFU's comment #8756 (comment)

> The code looks good to me but the I don't know if the comment "not compressed" can be replaced, if decompress_buffer is called and decompressed_size == 0 , seems that it generally means something like "this page only have levels, but not have non-null values"? ( Point me out if I'm wrong)

# What changes are included in this PR?

Include some comments

# Are these changes tested?

No (there are no code changes)

# Are there any user-facing changes?

No, this is internal comments only. No code / behavior changes
1 parent 220d0ea commit d379b98

1 file changed: +2 -2 lines changed

parquet/src/file/serialized_reader.rs

Lines changed: 2 additions & 2 deletions
@@ -387,8 +387,6 @@ pub(crate) fn decode_page(
         can_decompress = header_v2.is_compressed.unwrap_or(true);
     }
 
-    // TODO: page header could be huge because of statistics. We should set a
-    // maximum page header size and abort if that is exceeded.
     let buffer = match decompressor {
         Some(decompressor) if can_decompress => {
             let uncompressed_page_size = usize::try_from(page_header.uncompressed_page_size)?;
@@ -398,6 +396,8 @@ pub(crate) fn decode_page(
             let decompressed_size = uncompressed_page_size - offset;
             let mut decompressed = Vec::with_capacity(uncompressed_page_size);
             decompressed.extend_from_slice(&buffer[..offset]);
+            // decompressed size of zero corresponds to a page with no non-null values
+            // see https://github.com/apache/parquet-format/blob/master/README.md#data-pages
             if decompressed_size > 0 {
                 let compressed = &buffer[offset..];
                 decompressor.decompress(compressed, &mut decompressed, Some(decompressed_size))?;
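
The second hunk documents why the decompressor is skipped when `decompressed_size` is zero. The sketch below illustrates that control flow in isolation. It is not the parquet crate's code: the `Decompress` trait, the `CopyCodec` toy codec, and the `decode_v2_body` function are hypothetical names introduced here, and the codec merely copies bytes so the example needs no external crates. It assumes, as in the diff, that the first `offset` bytes of the buffer hold the levels and are stored uncompressed.

```rust
/// Hypothetical stand-in for a page codec (an assumption for this sketch,
/// not the parquet crate's actual trait).
trait Decompress {
    fn decompress(
        &mut self,
        input: &[u8],
        output: &mut Vec<u8>,
        size_hint: Option<usize>,
    ) -> Result<usize, String>;
}

/// Toy codec: "decompresses" by copying bytes and counts how often it is called.
struct CopyCodec {
    calls: usize,
}

impl Decompress for CopyCodec {
    fn decompress(
        &mut self,
        input: &[u8],
        output: &mut Vec<u8>,
        _size_hint: Option<usize>,
    ) -> Result<usize, String> {
        self.calls += 1;
        output.extend_from_slice(input);
        Ok(input.len())
    }
}

/// Rebuilds a page body: the first `offset` bytes (levels) are copied as-is,
/// and the remaining bytes (values) are decompressed only if any are expected.
fn decode_v2_body(
    buffer: &[u8],
    offset: usize,
    uncompressed_page_size: usize,
    codec: &mut dyn Decompress,
) -> Result<Vec<u8>, String> {
    // Unlike the plain subtraction in the diff, fail cleanly on inconsistent sizes.
    let decompressed_size = uncompressed_page_size
        .checked_sub(offset)
        .ok_or_else(|| "levels are longer than the uncompressed page".to_string())?;
    if offset > buffer.len() {
        return Err("levels are longer than the buffer".to_string());
    }

    let mut decompressed = Vec::with_capacity(uncompressed_page_size);
    decompressed.extend_from_slice(&buffer[..offset]);

    // A decompressed size of zero corresponds to a page with no non-null
    // values: only levels are present, so there is nothing to decompress.
    if decompressed_size > 0 {
        codec.decompress(&buffer[offset..], &mut decompressed, Some(decompressed_size))?;
    }

    if decompressed.len() != uncompressed_page_size {
        return Err("decompressed page has an unexpected size".to_string());
    }
    Ok(decompressed)
}
```

Calling `decode_v2_body(&[1, 2, 3], 3, 3, &mut codec)` with a fresh `CopyCodec` returns the three level bytes unchanged and leaves `codec.calls` at zero, which mirrors the levels-only case described in the new comment.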
