Skip to content

Conversation

@rambleraptor
Copy link
Contributor

Which issue does this PR close?

We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.

Rationale for this change

The DeltaBitPackDecoder can panic if it encounters a bit width in the encoded data that is larger than the bit width of the data type being decoded.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Oct 22, 2025
@mapleFU
Copy link
Member

mapleFU commented Oct 23, 2025

I wonder would this actually happens? Do someone meets this issue?

@etseidl
Copy link
Contributor

etseidl commented Oct 23, 2025

I wonder would this actually happens? Do someone meets this issue?

It can happen if an encoder calculates the deltas in the wrong domain (I know because my first attempt at writing an encoder for cudf did just this). The spec had been ambiguous on this point, but we changed the wording to disallow this situation (apache/parquet-format#231).

Copy link
Contributor

@etseidl etseidl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rambleraptor, this looks good to me (mod lint).

#[test]
fn test_delta_bit_packed_invalid_bit_width() {
// Manually craft a buffer with an invalid bit width
let mut buffer = vec![];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the documentation here, I think adding #[allow(clippy::vec_init_then_push)] is warranted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a #[allow and pushed a commit

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to resolve the CI failures

#[test]
fn test_delta_bit_packed_invalid_bit_width() {
// Manually craft a buffer with an invalid bit width
let mut buffer = vec![];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a #[allow and pushed a commit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants