range end index 294912 out of range for slice of length 147456 #540
Comments
@andygrove Is there a reproducible example?
By the way, it doesn't fail in the shuffle writer. From the stack trace, it looks like the error happened when the writer pulled the next batch from its upstream.
I was running TPC-H @ 100gb
It failed on q10. I am adding debug logging to see if I can track this down.
The error is happening in unsafe code in arrow-rs. Here is some debug output showing the calls leading up to the error:
Note that there are many earlier calls that look identical and do not fail. The error happens in
This function calls get_last_offset:
pub(super) unsafe fn get_last_offset<T: ArrowNativeType>(offset_buffer: &MutableBuffer) -> T {
    // JUSTIFICATION
    //  Benefit
    //      20% performance improvement extend of variable sized arrays (see bench `mutable_array`)
    //  Soundness
    //      * offset buffer is always extended in slices of T and aligned accordingly.
    //      * Buffer[0] is initialized with one element, 0, and thus `mutable_offsets.len() - 1` is always valid.
    let (prefix, offsets, suffix) = offset_buffer.as_slice().align_to::<T>();
    debug_assert!(prefix.is_empty() && suffix.is_empty());
    *offsets.get_unchecked(offsets.len() - 1)
}
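For comparison, here is a minimal safe sketch of the same idea: reading the last offset out of a raw offset buffer, but with the length and alignment assumptions checked instead of trusted. It is not the arrow-rs implementation. The name `last_offset_checked`, the fixed `i32` offset type, and the little-endian layout are all assumptions made for illustration.

```rust
/// Hypothetical helper (not part of arrow-rs): returns the last `i32` offset
/// stored in `bytes`, or `None` if the buffer is empty or its length is not
/// a whole number of 4-byte offsets.
/// Assumption: offsets are stored as little-endian `i32` values.
fn last_offset_checked(bytes: &[u8]) -> Option<i32> {
    const WIDTH: usize = std::mem::size_of::<i32>();
    // Reject buffers that cannot hold at least one complete offset.
    if bytes.len() < WIDTH || bytes.len() % WIDTH != 0 {
        return None;
    }
    // Copy the trailing 4 bytes and decode them as one offset.
    let mut tail = [0u8; WIDTH];
    tail.copy_from_slice(&bytes[bytes.len() - WIDTH..]);
    Some(i32::from_le_bytes(tail))
}
```

The unsafe version skips these checks for speed, so a buffer that violates the stated soundness assumptions would not be caught at this point.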
I ran with a debug build and got a more detailed stack trace:
The error happens at this line: let new_values = &values[start_values..end_values];
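To make the failure mode concrete, the sketch below shows how an offset that points past the end of the values buffer could be surfaced as an error instead of a panic. This is only an illustration, not the arrow-rs code: `slice_values` is a hypothetical helper, and the only numbers taken from the issue are the two in the title (end index 294912, buffer length 147456).

```rust
/// Hypothetical helper (not part of arrow-rs): slices `values[start..end]`,
/// returning a descriptive error instead of panicking when the range does
/// not fit inside the buffer.
fn slice_values(values: &[u8], start: usize, end: usize) -> Result<&[u8], String> {
    // `slice::get` returns `None` for an out-of-bounds or inverted range.
    values.get(start..end).ok_or_else(|| {
        format!(
            "range {}..{} out of range for values buffer of length {}",
            start,
            end,
            values.len()
        )
    })
}
```

With a 147456-byte values buffer, `slice_values(&values, 0, 294912)` returns an `Err`, whereas the plain index expression `&values[0..294912]` panics with a message like the one in the issue title. A check of this kind would point at an inconsistency between the offset buffer and the values buffer rather than at the slicing code itself.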
Here is some more debug info, showing the size of the buffers in the array data and the value in the first element of each buffer:
OK. I tried TPC-H @ 100gb and TPC-H @ 1gb. Only TPC-H @ 100gb reproduces it. I will look into this.
After debugging for a while I have a hint (I know which operator causes it), though I haven't found the root cause yet.
I've just figured out the root cause. I will propose a fix to arrow-rs. EDIT: actually the issue is in Java Arrow, not arrow-rs.
I opened an issue in the Java Arrow repo describing the root cause: apache/arrow#42156. A fix there might have to wait for a longer release cycle. I'm thinking to copy
Describe the bug
I just saw this when running benchmarks with latest from main and with xxhash64 disabled.
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response