Merged
12 changes: 9 additions & 3 deletions datafusion/functions-nested/src/set_ops.rs
@@ -516,11 +516,16 @@ fn general_array_distinct<OffsetSize: OffsetSizeTrait>(
     let mut new_arrays = Vec::with_capacity(array.len());
     let converter = RowConverter::new(vec![SortField::new(dt)])?;
     // distinct for each list in ListArray
-    for arr in array.iter().flatten() {
+    for arr in array.iter() {
+        let last_offset: OffsetSize = offsets.last().copied().unwrap();
+        let Some(arr) = arr else {
+            // Add same offset for null
+            offsets.push(last_offset);
+            continue;
+        };
         let values = converter.convert_columns(&[arr])?;
         // sort elements in list and remove duplicates
         let rows = values.iter().sorted().dedup().collect::<Vec<_>>();
-        let last_offset: OffsetSize = offsets.last().copied().unwrap();
         offsets.push(last_offset + OffsetSize::usize_as(rows.len()));
         let arrays = converter.convert_rows(rows)?;
         let array = match arrays.first() {
@@ -538,6 +543,7 @@ fn general_array_distinct<OffsetSize: OffsetSizeTrait>(
         Arc::clone(field),
         offsets,
         values,
-        None,
+        // Keep the list nulls
Contributor

👍

+        array.nulls().cloned(),
     )?))
 }
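
The effect of the change above can be sketched in plain Rust, without the Arrow types. This is only an illustrative model, not the DataFusion implementation: `ListModel` and `distinct_lists` are hypothetical names, the element type is fixed to `i32`, and a `Vec<bool>` stands in for the null buffer. It shows why a null row must still push an offset (repeating the previous one) so that the offsets and the validity information stay aligned with the input rows.

```rust
/// Hypothetical plain-Rust stand-in for a list array:
/// flattened values, row offsets, and one validity flag per row.
#[derive(Debug, PartialEq)]
struct ListModel {
    offsets: Vec<usize>,
    values: Vec<i32>,
    validity: Vec<bool>,
}

/// Deduplicate each row, keeping null rows as null.
/// (Illustrative only; the real code works on Arrow arrays via RowConverter.)
fn distinct_lists(rows: &[Option<Vec<i32>>]) -> ListModel {
    let mut offsets = vec![0usize];
    let mut values = Vec::new();
    let mut validity = Vec::with_capacity(rows.len());

    for row in rows {
        let last_offset = *offsets.last().unwrap();
        let Some(list) = row else {
            // Null row: repeat the previous offset so the row has length 0,
            // and record it as null instead of dropping it.
            offsets.push(last_offset);
            validity.push(false);
            continue;
        };
        // Sort the elements, then remove adjacent duplicates.
        let mut distinct = list.clone();
        distinct.sort_unstable();
        distinct.dedup();
        offsets.push(last_offset + distinct.len());
        values.extend(distinct);
        validity.push(true);
    }

    ListModel { offsets, values, validity }
}
```

With the old `.flatten()` loop, the null row would have been skipped entirely, so the output would have had one row fewer than the input; in this model that corresponds to pushing neither an offset nor a validity flag for that row.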
7 changes: 7 additions & 0 deletions datafusion/sqllogictest/test_files/array.slt
@@ -5674,6 +5674,13 @@ select array_distinct([sum(a)]) from t1 where a > 100 group by b;
 statement ok
 drop table t1;

+query ?
+select array_distinct(a) from values ([1, 2, 3]), (null), ([1, 3, 1]) as X(a);
+----
+[1, 2, 3]
+NULL
+[1, 3]
Contributor

Does this mean that `datafusion-cli -c "select array_distinct(null);"` should also succeed?
It seems that array_distinct only accepts arguments of array type.

DataFusion CLI v44.0.0
Error: Error during planning: Error during planning: Failed to coerce arguments to satisfy a call to array_distinct function: coercion from [Null] to the signature ArraySignature(Array) failed. No function matches the given name and argument types 'array_distinct(Null)'. You might need to add explicit type casts.
        Candidate functions:
        array_distinct(array)

ref.

signature: Signature::array(Volatility::Immutable),

Contributor

> Does this mean that datafusion-cli -c select array_distinct(null); should also succeed? It seems that array_distinct only accepts arguments of array type.

I would expect that array_distinct(null) would return null as well. A few lines up it seems there is a reference to

#TODO: https://github.com/apache/datafusion/issues/7142
#query ?
#select array_distinct(null);
#----
#NULL

I tried it with this PR and found that the query still doesn't work.

Thus I think this PR makes the behavior neither better nor worse.


query ?
select array_distinct([]);
----