-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support array_resize
#8744
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but I have a different idea about array_size
.
We can apply the combination of existing array_length
, array_slice
, array_repeat
and array_concat
to resize the array. But I'm not sure is it faster or slower, is it easier to maintain or not. How do you think about this approach @Weijun-H ?
array_concat(array_slice(array, 0, min(array_length, count)), array_repeat(default_array, max(count-array_length), 0))
|
||
for (index, arr) in array.iter().enumerate() { | ||
let count = count_array.value(index).to_usize().ok_or_else(|| { | ||
DataFusionError::Execution( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: exec_err
array: &GenericListArray<O>, | ||
count_array: &Int64Array, | ||
field: &FieldRef, | ||
new_element: Option<ArrayRef>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_element
might be more concise?
I think this approach is not a better idea, as supporting the column for |
thanks @Weijun-H I'll review it tomorrow more closely if no one else beats me. |
let new_rows = vec![&default_value; count - remain_count]; | ||
rows.extend(new_rows); | ||
|
||
let last_offset = offsets.last().copied().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please handle the potential error on .unwrap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Weijun-H tests are awesome.
Im just wondering could we just concat 2 arrays(old+new one with default values) to do a resize?
I have slight concern that performance on current implementation, maybe I'm wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was marked as outdated.
This comment was marked as outdated.
e66acd6
to
cba3269
Compare
|
||
for (row_index, offset_window) in array.offsets().windows(2).enumerate() { | ||
let count = count_array.value(row_index).to_usize().ok_or_else(|| { | ||
exec_datafusion_err!("array_resize: failed to convert size to usize") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does type conversion more like internal_err?
wasn't expected/anticipated by the implementation and that is most likely a bug (the error message even encourages users to open a bug report)
unlike array.len() != number, which is error
from user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better with MutableArray! LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which issue does this PR close?
Closes #7194
Rationale for this change
array_resize
LargeList
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?