Skip to content

make_array does not properly support nulls  #6887

Closed
@alamb

Description

@alamb

Describe the bug

The make_array function doesn't appear to properly support nulls for string arrays

To Reproduce

When I run this script

create table foo as values ('foo', null), ('bar', null), (null, 'baz');
select * from foo;
select
  make_array(column1, column2)[1],
  make_array(column1, column2)[1] IS NULL
from foo;

select
  make_array(column1, column2)[2],
  make_array(column1, column2)[2] IS NULL
from foo;

I get the following output (note that the output of make_array that had a null input does not have a null output):

0 rows in set. Query took 0.004 seconds.
+---------+---------+
| column1 | column2 |
+---------+---------+
| foo     |         |
| bar     |         |
|         | baz     |
+---------+---------+
3 rows in set. Query took 0.002 seconds.
+----------------------------------------+------------------------------------------------+
| make_array(foo.column1,foo.column2)[1] | make_array(foo.column1,foo.column2)[1] IS NULL |
+----------------------------------------+------------------------------------------------+
| foo                                    | false                                          |
| bar                                    | false                                          |
|                                        | false                                          |
+----------------------------------------+------------------------------------------------+
3 rows in set. Query took 0.007 seconds.
+----------------------------------------+------------------------------------------------+
| make_array(foo.column1,foo.column2)[2] | make_array(foo.column1,foo.column2)[2] IS NULL |
+----------------------------------------+------------------------------------------------+
|                                        | false                                          |
|                                        | false                                          |
| baz                                    | false                                          |
+----------------------------------------+------------------------------------------------+
3 rows in set. Query took 0.007 seconds.

Expected behavior

This is what used to happen (note the true/false):

(arrow_dev) alamb@MacBook-Pro-8:~/Software/arrow-datafusion2/benchmarks$ datafusion-cli -f /tmp/repro.sql
DataFusion CLI v27.0.0
0 rows in set. Query took 0.002 seconds.
+---------+---------+
| column1 | column2 |
+---------+---------+
| foo     |         |
| bar     |         |
|         | baz     |
+---------+---------+
3 rows in set. Query took 0.001 seconds.
+----------------------------------------+------------------------------------------------+
| make_array(foo.column1,foo.column2)[1] | make_array(foo.column1,foo.column2)[1] IS NULL |
+----------------------------------------+------------------------------------------------+
| foo                                    | false                                          |
| bar                                    | false                                          |
|                                        | true                                           |
+----------------------------------------+------------------------------------------------+
3 rows in set. Query took 0.002 seconds.
+----------------------------------------+------------------------------------------------+
| make_array(foo.column1,foo.column2)[2] | make_array(foo.column1,foo.column2)[2] IS NULL |
+----------------------------------------+------------------------------------------------+
|                                        | true                                           |
|                                        | true                                           |
| baz                                    | false                                          |
+----------------------------------------+------------------------------------------------+
3 rows in set. Query took 0.001 seconds.

Additional context

I found this while trying to update IOx to use the latest datafusion: https://github.com/influxdata/influxdb_iox/pull/8127

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions