Closed
Description
Describe the bug
The array_contains
code seems to be overly-flattening input lists and in doing so is generating incorrect data when one of the parameters is a column of List-type.
To Reproduce
I've attached a parquet table containing a column with type List(String).
When use array_contains on this data, I get this result set:
❯ create external table t0 stored as parquet location '/Users/max/tmp/array_contains.parquet';
0 rows in set. Query took 0.017 seconds.
❯ select bid_node_ids from t0 where array_contains(bid_node_ids, ['z+CPVybgUuCXlAE3A3jqyg==']);
+----------------------------+
| bid_node_ids |
+----------------------------+
| [okwzcOFM3yjUzNFbc/BYBQ==] |
| [DbNysJTF560NzR/HLbAa/Q==] |
| [ivO3+Z+WMRqwhivy85d6KA==] |
+----------------------------+
3 rows in set. Query took 0.076 seconds.
❯
Note that none of the resulting bid_node_ids
values contain the queried-for value of z+CPVybgUuCXlAE3A3jqyg==
Expected behavior
I was expecting that there are 861 matching results in the result set, all of which contain the value z+CPVybgUuCXlAE3A3jqyg==
❯ select bid_node_ids from t0 where array_contains(bid_node_ids, ['z+CPVybgUuCXlAE3A3jqyg==']);
+--------------------------------------------------------------------------------+
| bid_node_ids |
+--------------------------------------------------------------------------------+
| [wFEkOS2AFYxekv7SzPrkiQ==, z+CPVybgUuCXlAE3A3jqyg==] |
[....snip...]
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==] |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==] |
| [z+CPVybgUuCXlAE3A3jqyg==] |
| [iTd7HyShRr0PqSKyqKT0+A==, z+CPVybgUuCXlAE3A3jqyg==] |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==] |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==] |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==] |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==] |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==] |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==] |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==] |
+--------------------------------------------------------------------------------+
861 rows in set. Query took 1.069 seconds.
Additional context
I've hacked together a change on our branch that gives us the changes we are expecting: urbanlogiq@a381f10 but I'm not sure if this fix is what is intended by the original author.