-
Notifications
You must be signed in to change notification settings - Fork 962
Fix RowConverter when FixedSizeList is not the last #7789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix RowConverter when FixedSizeList is not the last #7789
Conversation
Fix `RowConverter` row decoding when there is a `FixedSizeList` element and it's not the last.
🤖 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @findepi
I plan to approve this PR pending a performance check
🤖: Benchmark completed Details
|
When columns are unequal sizes, then encoding them can panic. This can be observed by changing `test_fixed_size_list_with_variable_width_content`, eg adding one more element to the second list. What's unclear, the panic can be avoided by calling `tracker.materialized()` in `compute_lengths_fixed_size_list` (similar to how list encoding does it). However, we still won't encode the whole data passed in, so it's better to reject this upfront.
for colum in columns.iter().skip(1) { | ||
if colum.len() != columns[0].len() { | ||
return Err(ArrowError::InvalidArgumentError(format!( | ||
"RowConverter columns must all have the same length, expected {} got {}", | ||
columns[0].len(), | ||
colum.len() | ||
))); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another check added. This guards against passing ill-formed data from the user.
Without this, it's possible to get a panic inside the RowEncoder.
🤖 |
I am not sure what happened to the benchmark script -- I will rerun shortly |
🤖 |
🤖: Benchmark completed Details
|
I will rerun the benchmarks on this PR again -- some of the results look like noise |
🤖 |
🤖: Benchmark completed Details
|
It looks like |
Possibly -- it could also be noise in the benchmark the next step woudl be to see if we can reproduce the slowdown locally (run |
Which issue does this PR close?
none
Rationale for this change
RowConverter
decoding fails when there is aFixedSizeList
element and it's not the last.What changes are included in this PR?
Fix
RowConverter
row decoding when there is aFixedSizeList
element and it's not the last.Are these changes tested?
yes