ABS() function in WHERE clause gives unexpected results #723

@mcassels

Describe the bug
Using ABS(col - x) in the WHERE clause of a query sometimes filters results incorrectly. The test file has a float column with high-precision values. We want to use ABS(col - x) < y as an approximate-equality comparison on high-precision float columns.
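
For context, the tolerance comparison we have in mind behaves like the Python sketch below. The helper name `approx_equal` and the tolerance `1e-9` are illustrative assumptions and are not part of DataFusion.

```python
def approx_equal(value: float, target: float, tolerance: float) -> bool:
    """Return True when value is strictly within tolerance of target,
    mirroring the SQL predicate ABS(col - x) < y."""
    return abs(value - target) < tolerance


# Exact equality on floats is fragile: 0.1 + 0.2 is not exactly 0.3
# in binary floating point, so == reports a mismatch.
exact = (0.1 + 0.2) == 0.3

# The tolerance check accepts the same pair of values.
close = approx_equal(0.1 + 0.2, 0.3, 1e-9)
```

Here `exact` is False while `close` is True, which is why we filter on the absolute difference rather than on `=`.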

To Reproduce
datafusion-cli example using the attached test Parquet file:

➜  arrow-datafusion git:(master) ✗ cargo run --bin datafusion-cli
    Finished dev [unoptimized + debuginfo] target(s) in 0.22s
     Running `target/debug/datafusion-cli`
> CREATE EXTERNAL TABLE foo STORED AS PARQUET LOCATION 'test.parquet';
0 rows in set. Query took 0.001 seconds.
> select * from foo;
+--------------------+
| c0                 |
+--------------------+
| 107.0090813093981  |
| 125.51519138755981 |
| 141.83342587451415 |
| 113.65534481251639 |
| 251.10794896957802 |
| 112.08361695028363 |
+--------------------+
6 rows in set. Query took 0.006 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo;
+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 107.0090813093981  | 144.0988676601799                         |
| 125.51519138755981 | 125.59275758201821                        |
| 141.83342587451415 | 109.27452309506387                        |
| 113.65534481251639 | 137.45260415706161                        |
| 251.10794896957802 | 0                                         |
| 112.08361695028363 | 139.02433201929438                        |
+--------------------+-------------------------------------------+
6 rows in set. Query took 0.006 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 1;
0 rows in set. Query took 0.005 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 111;
0 rows in set. Query took 0.003 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 150;
+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 107.0090813093981  | 144.0988676601799                         |
| 125.51519138755981 | 125.59275758201821                        |
| 141.83342587451415 | 109.27452309506387                        |
| 113.65534481251639 | 137.45260415706161                        |
| 251.10794896957802 | 0                                         |
| 112.08361695028363 | 139.02433201929438                        |
+--------------------+-------------------------------------------+
6 rows in set. Query took 0.007 seconds.

Expected behavior
The query

> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 1;

was expected to give the following 1 row:

+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 251.10794896957802 | 0                                         |
+--------------------+-------------------------------------------+

And the query

> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 111;

was expected to give the following 2 rows:

+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 251.10794896957802 | 0                                         |
| 141.83342587451415 | 109.27452309506387                        |
+--------------------+-------------------------------------------+
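
As a cross-check of the expected row counts, here is a minimal Python sketch that applies the same predicate to the six c0 values printed above. The names `C0`, `TARGET`, and `filter_rows` are hypothetical and only serve this illustration; the sketch says nothing about how DataFusion evaluates the filter internally.

```python
# The six c0 values as printed by `select * from foo`.
C0 = [
    107.0090813093981,
    125.51519138755981,
    141.83342587451415,
    113.65534481251639,
    251.10794896957802,
    112.08361695028363,
]
TARGET = 251.10794896957802


def filter_rows(values, target, threshold):
    """Keep (value, abs difference) pairs where ABS(value - target) < threshold."""
    return [(v, abs(v - target)) for v in values if abs(v - target) < threshold]


# Row counts for the three thresholds used in the queries above.
for threshold in (1, 111, 150):
    rows = filter_rows(C0, TARGET, threshold)
    print(threshold, len(rows))
# prints:
# 1 1
# 111 2
# 150 6
```

Only the `< 150` count matches what datafusion-cli returned; the `< 1` and `< 111` queries returned 0 rows instead of 1 and 2.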

Additional context
Test parquet: test.parquet.zip
