Skip to content

All array functions should represent NULL as an element #7142

Open
@izveigor

Description

@izveigor

Is your feature request related to a problem or challenge?

Follow on to #6662
We have important problem related to using array functions with NULL statements. As @alamb said in PR (see below for full details) DataFusion casting system should be adapted for it, otherwise it will always throw errors.

The current implementation:

❯ select array_append([1, 2, 3, 4, 5], NULL);
Optimizer rule 'simplify_expressions' failed
caused by
This feature is not implemented: Array_append is not implemented for types 'Int64' and 'Null'.

Should be:

❯ select array_append([1, 2, 3, 4, 5], NULL);
----
[1, 2, 3, 4, 5, NULL]

Describe the solution you'd like

@alamb statement about the issue:

I am not sure about this approach of taking either a ListArray or a NullArray

In the other functions, the way NULL is treated is that the input types are always the same (in this case ListArray) and the values would be null (aka array.is_valid(i) would return false for rows that are null.

Complicating matters is if you type a literal null in sql like:

select array_concat([1,2], null)

That comes to DataFusion as a null literal (with DataType::Null). The coercion / casting logic normally will coerce this to the appropriate type.

For example, here is how I think arithmetic works with null:

select 1 + NULL

Arrives like

ScalarValue::Int32(Some(1)) + ScalarValue::Null

And then the coercion logic will add a cast to Int32:

ScalarValue::Int32(Some(1)) + CAST(ScalarValue::Null, DataType::Int32)

And then the constant folder will collapse this into:

ScalarValue::Int32(Some(1)) + ScalarValue::Int32(None)

So by the time the arithmetic kernel sees it, it only has to deal with arguments of Int32

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions