Description
Having only return_type does not let a UDF reflect its actual nullability.
The following problems exist when UDFs wrongly report their nullability:
Optimizations based on expression nullability are disallowed
For example, let's assume I have 2 columns:
col1, which is i64 without nulls
col2, which is i64 with nulls
and the expression ceil does not implement return_type_from_args, so by default its nullability is true.
This means that I can't optimize this query:
select coalesce(ceil(col1), col2) from tbl;
to be this:
select ceil(col1) from tbl;
because ceil reports that it is nullable even though its input expression is not nullable.
In the same way, the expression array_remove_all(array, null) cannot be removed if the array expression reports that it is nullable when it is not.
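To make the coalesce case concrete, here is a minimal, self-contained Rust sketch. It is not DataFusion's API: the `Expr` enum, `NullabilityMode`, `is_nullable`, `simplify_coalesce`, and `example_query` are all hypothetical names introduced for illustration. It assumes ceil returns null only when its input is null, and compares a function that always reports nullable output with one that derives nullability from its arguments.

```rust
// Hypothetical, simplified expression tree; NOT DataFusion's actual `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A column reference plus the nullability recorded in its schema.
    Column { name: String, nullable: bool },
    /// `ceil(arg)`.
    Ceil(Box<Expr>),
    /// `coalesce(args...)`.
    Coalesce(Vec<Expr>),
}

/// How a function derives its output nullability (hypothetical).
#[derive(Debug, Clone, Copy)]
enum NullabilityMode {
    /// Only argument types are visible, so the output is conservatively nullable.
    AlwaysNullable,
    /// Argument nullability is visible and is propagated to the output.
    FromArgs,
}

fn is_nullable(expr: &Expr, mode: NullabilityMode) -> bool {
    match expr {
        Expr::Column { nullable, .. } => *nullable,
        Expr::Ceil(arg) => match mode {
            NullabilityMode::AlwaysNullable => true,
            // Assumption: ceil is null only when its input is null.
            NullabilityMode::FromArgs => is_nullable(arg, mode),
        },
        // coalesce can be null only if every argument can be null.
        Expr::Coalesce(args) => args.iter().all(|a| is_nullable(a, mode)),
    }
}

/// Drops every coalesce argument after the first non-nullable one, and
/// unwraps the coalesce entirely when a single argument remains.
fn simplify_coalesce(expr: Expr, mode: NullabilityMode) -> Expr {
    match expr {
        Expr::Coalesce(args) => {
            let mut kept = Vec::new();
            for arg in args {
                let stop = !is_nullable(&arg, mode);
                kept.push(arg);
                if stop {
                    break;
                }
            }
            if kept.len() == 1 {
                kept.pop().unwrap()
            } else {
                Expr::Coalesce(kept)
            }
        }
        other => other,
    }
}

/// Builds `coalesce(ceil(col1), col2)` with col1 non-nullable and col2 nullable.
fn example_query() -> Expr {
    let col1 = Expr::Column { name: "col1".to_string(), nullable: false };
    let col2 = Expr::Column { name: "col2".to_string(), nullable: true };
    Expr::Coalesce(vec![Expr::Ceil(Box::new(col1)), col2])
}
```

With `NullabilityMode::AlwaysNullable`, `simplify_coalesce(example_query(), ...)` leaves the query unchanged. With `NullabilityMode::FromArgs`, it reduces to `ceil(col1)`, which is the rewrite described above.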
Wrong types
The wrong type is created:
> select arrow_typeof(make_array(1));
+------------------------------------------------------------------------------------------------------------------+
| arrow_typeof(make_array(Int64(1))) |
+------------------------------------------------------------------------------------------------------------------+
| List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) |
+------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.002 seconds.
The item field should have nullable: false, but because make_array implements return_type rather than return_type_from_args, it does not get the arguments' nullability information.
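The difference can be sketched in a few lines of standalone Rust. This is an illustration under stated assumptions, not DataFusion or Arrow code: `DataType`, `Arg`, `make_array_type_from_types`, and `make_array_type_from_args` are hypothetical, and the sketch assumes all arguments share the first argument's type.

```rust
// Hypothetical, simplified type model; NOT Arrow's `DataType` or `Field`.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    /// A list whose item field carries its own nullability flag.
    List { item: Box<DataType>, item_nullable: bool },
}

/// What a function would see per argument if nullability were passed in.
#[derive(Debug, Clone)]
struct Arg {
    data_type: DataType,
    nullable: bool,
}

/// Types-only variant: with no nullability information available, the item
/// field has to be marked nullable to stay safe.
fn make_array_type_from_types(arg_types: &[DataType]) -> Option<DataType> {
    let first = arg_types.first()?;
    Some(DataType::List {
        item: Box::new(first.clone()),
        item_nullable: true,
    })
}

/// Args-aware variant: the item field is nullable only if some argument is.
fn make_array_type_from_args(args: &[Arg]) -> Option<DataType> {
    let first = args.first()?;
    Some(DataType::List {
        item: Box::new(first.data_type.clone()),
        item_nullable: args.iter().any(|a| a.nullable),
    })
}
```

For a single non-null Int64 literal, the types-only variant yields a list with a nullable item, matching the output above, while the args-aware variant yields a list with a non-nullable item.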