Description
Describe the bug, including details regarding any error messages, version, and platform.
When resizing the underlying buffer for the var-length content of the row table, we do:
arrow/cpp/src/arrow/compute/row/row_internal.cc
Lines 296 to 299 in 674e221
    int64_t num_bytes = offsets()[num_rows_];
    if (bytes_capacity_ >= num_bytes + num_extra_bytes || metadata_.is_fixed_length) {
      return Status::OK();
    }
This code treats the second buffer (row content if the row table is fixed-length, offsets otherwise) as offsets regardless of whether the table is fixed-length. The fixed-length check only happens afterwards, in the same condition; when it holds, resizing the var-length buffer is unnecessary and the function returns early.
But treating the second buffer as offsets unconditionally is problematic because, among other things, the buffer can be smaller than an offset buffer would need to be. Consider a row table containing a single uint8 column with 1-byte alignment: each row takes 1 byte, less than the 4 bytes per row that an offset takes, so the offset access reads beyond the buffer boundary.
I have a repro case locally and will send it out as a unit test with my fix PR.
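To make the size mismatch concrete, here is a rough sketch of the arithmetic. The helper names are hypothetical and are not Arrow APIs. The sketch assumes 4-byte offsets as described above and ignores any padding the real buffer allocation may add.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Arrow API): bytes of row content held by the
// second buffer of a fixed-length row table, ignoring allocation padding.
inline int64_t FixedLengthRowBytes(int64_t num_rows, int64_t row_width) {
  assert(num_rows >= 0 && row_width > 0);
  return num_rows * row_width;
}

// Hypothetical helper: bytes that must exist for reading element `index`
// of an array whose elements are `offset_width` bytes wide.
inline int64_t BytesNeededToReadOffset(int64_t index, int64_t offset_width) {
  assert(index >= 0 && offset_width > 0);
  return (index + 1) * offset_width;
}

// True if reading offsets()[num_rows] stays inside a buffer that actually
// holds fixed-length rows of `row_width` bytes each.
inline bool OffsetReadIsInBounds(int64_t num_rows, int64_t row_width,
                                 int64_t offset_width = 4) {
  return BytesNeededToReadOffset(num_rows, offset_width) <=
         FixedLengthRowBytes(num_rows, row_width);
}
```

With 10 rows of one uint8 column each, the buffer holds 10 bytes, while reading the offset at index 10 needs 44 bytes, so `OffsetReadIsInBounds(10, 1)` is false.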
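For illustration only, one possible shape of the guard is to check fixed-length-ness before touching the offsets at all. This is a sketch over a hypothetical stand-in struct, not the actual `RowTableImpl` code, and the fix PR may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the row table; field names are illustrative only.
struct MockRowTable {
  bool is_fixed_length;
  int64_t num_rows;
  int64_t bytes_capacity;
  std::vector<uint32_t> offsets;  // only meaningful when !is_fixed_length
};

// Returns true when the var-length buffer would need to grow.
// The fixed-length check comes first, so `offsets` is never read for a
// fixed-length table.
inline bool NeedsVarLengthResize(const MockRowTable& t, int64_t num_extra_bytes) {
  if (t.is_fixed_length) {
    return false;  // no var-length buffer to resize
  }
  const std::size_t last = static_cast<std::size_t>(t.num_rows);
  assert(last < t.offsets.size());  // offsets has num_rows + 1 entries
  const int64_t num_bytes = t.offsets[last];
  return t.bytes_capacity < num_bytes + num_extra_bytes;
}
```

With this ordering, a fixed-length table returns early without ever interpreting its row content as offsets.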
Component(s)
C++