@avkirilishin

Which issue does this PR close?

Closes #14099.

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Jan 16, 2025
… errors for InfallibleExprOrNull eval method

@alamb alamb left a comment

I verified that this PR fixes the reported error. Thank you @avkirilishin and @Omega359

I think it would be good to get

I actually ran the extended test and verified this does indeed fix the error:

INCLUDE_SQLITE=true cargo test --test sqllogictests
     Running bin/sqllogictests.rs (target/release/deps/sqllogictests-1a239c0394d49df0)
Completed in 5 minutes
External error: query is expected to fail, but actually succeed:
[SQL] SELECT + - 46 / COALESCE ( + 42, + COALESCE ( - COUNT ( * ), 89 ) / + ( 80 ) + 67 - - - 17 / - + 35 * + 91 * + 22, - 40 ) + - 10 * CASE WHEN NULL > NULL THEN + AVG ( ALL 60 ) END AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_39.slt:356

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL ( - CASE WHEN NOT NULL NOT BETWEEN NULL AND - 67 THEN - - SUM ( ALL + 29 ) END ) * 13 - 8 AS col2
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_12.slt:37580

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL - CAST ( NULLIF ( - CASE WHEN NOT 40 < NULL THEN MIN ( + CAST ( NULL AS INTEGER ) ) ELSE NULL END, + 50 + 18 ) AS INTEGER ) AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_15.slt:12140

External error: query is expected to fail, but actually succeed:
[SQL] SELECT 30 AS col2, - COUNT ( * ) * + 55 * + 57 * + NULLIF ( + CASE WHEN NOT ( + 13 ) IS NULL THEN - COUNT ( * ) END, - 53 )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_38.slt:43114

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL ( - CASE WHEN ( NULL ) IS NOT NULL OR NOT ( NOT + 56 IS NOT NULL ) THEN SUM ( - 80 ) END ) * CASE 85 WHEN AVG ( + 8 ) + + 0 THEN NULL WHEN - SUM ( DISTINCT 60 ) THEN 19 WHEN - COUNT ( * ) THEN NULL ELSE + CASE - ( + 55 ) WHEN 2 THEN 6 * 94 ELSE NULL END - - 48 END - 31
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_28.slt:25817

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL MAX ( DISTINCT 27 ) * - CASE WHEN ( NULL ) BETWEEN - CAST ( - 86 AS INTEGER ) AND + 46 + + 97 THEN COUNT ( * ) END * - 66 AS col2
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_64.slt:16401

External error: query is expected to fail, but actually succeed:
[SQL] SELECT DISTINCT + COALESCE ( + 71, + + 72 * CASE WHEN - 74 IS NULL THEN + MIN ( 18 ) END ) col2
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_14.slt:51693

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL ( - COUNT ( * ) ) * - CASE WHEN - 38 IS NOT NULL THEN ( - COUNT ( * ) ) ELSE NULL END AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_4.slt:15996

External error: query is expected to fail, but actually succeed:
[SQL] SELECT CASE WHEN + ( + 93 ) IS NULL THEN MIN ( ALL - + 0 ) ELSE NULL END / + 62 - + - 76 * COUNT ( * )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_108.slt:31436

External error: query is expected to fail, but actually succeed:
[SQL] SELECT 86, 45 + CAST ( + - 27 AS FLOAT8 ) + - 72 * + - NULLIF ( + 69, + + 96 ) * + + CASE WHEN NULL = NULL THEN COUNT ( * ) END * - 97 * 66 AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_73.slt:26746

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + 27 / - 76 * - ( ( - 21 ) ) * - CASE WHEN NOT + 53 = + 28 THEN COUNT ( * ) END * + CASE + 83 WHEN CASE COUNT ( * ) WHEN + 88 THEN 61 - + 10 WHEN 75 THEN + 25 + MIN ( DISTINCT + 35 - - 75 ) ELSE NULL END THEN COALESCE ( + ( + 74 ), - ( 42 ) ) * + COUNT ( * ) ELSE NULL END
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_72.slt:25043

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + COALESCE ( - 53, + 23, + 80 ) * ( + 31 ) + 68 + + 62 - + + CASE 51 WHEN - - 16 * - CASE WHEN - COUNT ( * ) > + + COALESCE ( - COALESCE ( 28, - ( - - CASE WHEN 87 BETWEEN + + COUNT ( * ) + - 89 AND - 80 + - - 1 + 85 AND NULL NOT BETWEEN NULL AND 81 + CASE CASE - 96 WHEN + ( + + 58 ) * + 87 THEN 77 ELSE - 40 * + 80 END WHEN 73 - 40 * - COUNT ( * ) THEN COALESCE ( COUNT ( * ), + 12 ) + - 47 WHEN + 53 + + 7 / 17 THEN + 92 * 0 WHEN 9 / - COALESCE ( + COUNT ( * ), SUM ( 5 ) * COUNT ( * ) + - CASE + 89 + 80 WHEN + 10 THEN NULL WHEN 36 THEN 19 + - 14 ELSE 12 END, COUNT ( * ) ) THEN NULL ELSE NULL END THEN COUNT ( * ) END ) - - 11, - 40 - 28, + 65 ), COUNT ( ALL - 50 ) ) THEN + 8 ELSE COUNT ( * ) END THEN - COUNT ( * ) + + NULLIF ( 46 + 30 * 71, + 94 + + MIN ( ALL 22 + + COALESCE ( 88, 48 ) * CAST ( 62 * 52 AS FLOAT8 ) ) ) WHEN + CAST ( NULL AS INTEGER ) THEN 70 WHEN - 89 THEN NULL END
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_66.slt:21439

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - 33 - CASE WHEN NOT - 5 IS NOT NULL THEN + AVG ( + + 28 ) ELSE NULL END * + 15 * - + ( - + 58 ) + + 67 + - 1 * - + 43, + 20 AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_7.slt:30971

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL - 72 * + + 61 * 99 * - 54 + + + NULLIF ( 23, - - COALESCE ( + + CASE WHEN NOT ( NULL ) <> ( NULL ) THEN MAX ( - 49 ) END, 16 - + 76 ) + 56 ) + + CASE - 20 WHEN COUNT ( * ) THEN NULL WHEN COUNT ( * ) THEN + 46 END * + 80 * + 62 AS col2
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_88.slt:48514

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + ( - 56 ) * - CASE WHEN NOT - 7 NOT IN ( + + 75 ) THEN SUM ( ALL - 79 ) ELSE NULL END
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_45.slt:11343

External error: query is expected to fail, but actually succeed:
[SQL] SELECT DISTINCT - - COUNT ( * ) * CASE WHEN NOT NULL IS NULL THEN COUNT ( + 39 ) END / - - 46 + - 88 AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_114.slt:7324

External error: query is expected to fail, but actually succeed:
[SQL] SELECT DISTINCT COUNT ( * ) + + AVG ( ALL + CASE - + 89 WHEN + 71 THEN 72 WHEN 63 THEN NULL ELSE NULL END ) * COUNT ( * ) / + 78 * + + NULLIF ( + - NULLIF ( ( + + MIN ( - NULLIF ( + + 57, - + 74 ) ) ), SUM ( ALL + 69 ) ), - 16 ) - - 47 - SUM ( 39 ) * + COUNT ( * ) * - CASE WHEN ( NULL ) >= NULL THEN COUNT ( * ) ELSE NULL END - 7 AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_118.slt:52009

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL CASE WHEN 47 IS NULL THEN - - MAX ( ALL - 69 ) END
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_48.slt:46223

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL - 58 * 59 * ( - - CASE WHEN 12 IS NULL THEN + COUNT ( DISTINCT + CAST ( - + 88 AS INTEGER ) ) END ) * - 70 * - COUNT ( ALL 35 ) * + 68 + + - 71
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_78.slt:36589

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - 95 / 75 / CAST ( NULL AS INTEGER ) + + 33 - - 46 * - - CASE WHEN NOT - 7 = NULL THEN COUNT ( * ) END + - - 13 * + 58 AS col2, 11 * - + 56 * 48 * - 20 AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_52.slt:19334

External error: query is expected to fail, but actually succeed:
[SQL] SELECT DISTINCT - 70 + + CASE WHEN NOT ( NULLIF ( + CAST ( + 14 AS INTEGER ), + 21 ) ) IS NULL THEN MAX ( - 96 ) ELSE NULL END + - + 76
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_46.slt:29752

External error: query is expected to fail, but actually succeed:
[SQL] SELECT CASE WHEN NOT ( 92 ) <= - 0 THEN + MIN ( + 30 ) ELSE NULL END col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_85.slt:15455

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - 79 * + 91 * - COUNT ( * ) * + - 2 * + - NULLIF ( - 49, - COALESCE ( - + 69, - COALESCE ( + COALESCE ( - 20, ( - 18 ) * + COUNT ( * ) + - 93, - CASE 51 WHEN + COUNT ( * ) + 28 THEN 0 ELSE + 29 * + CASE ( 50 ) WHEN - ( - ( CASE WHEN NOT + 37 IS NULL THEN + COUNT ( * ) END ) ) THEN NULL WHEN - 46 + 87 * - 28 THEN 85 WHEN - COUNT ( * ) THEN NULL END END ), COUNT ( * ) - 39 ) * + 22 ) / - COUNT ( * ) )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_102.slt:9959

External error: query is expected to fail, but actually succeed:
[SQL] SELECT CASE WHEN - 62 BETWEEN + 16 + + 55 AND + CASE + COALESCE ( 88, 74 * 1, + CAST ( NULL AS INTEGER ) / COUNT ( * ) ) WHEN COUNT ( 21 ) THEN 40 * CAST ( - 24 AS INTEGER ) ELSE NULL END THEN COUNT ( ALL + 78 ) ELSE NULL END AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_50.slt:48138

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + NULLIF ( + 41, - 91 / + CASE + + 84 WHEN + 38 * - 6 THEN 90 WHEN - AVG ( - 40 ) + - 22 * + - 4 THEN 27 ELSE - CASE WHEN NOT 71 <> NULL THEN + COUNT ( * ) END END - + - 59 * - - 45 ) AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_94.slt:21191

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + CASE WHEN + 60 IS NOT NULL THEN + MAX ( 53 ) ELSE NULL END col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_113.slt:274

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + + 92 * - 60 + - - COALESCE ( - ( CAST ( - 32 AS INTEGER ) ), - + CAST ( + ( - NULLIF ( + SUM ( DISTINCT + 60 ), CASE - 64 WHEN - CASE WHEN NULL >= ( NULL ) THEN MAX ( 91 ) END THEN 12 / + 23 WHEN - 68 THEN NULL END ) ) AS INTEGER ), + 35 ) col2
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_53.slt:43683

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + CASE 87 WHEN ( MAX ( DISTINCT 4 ) ) - MAX ( DISTINCT 61 ) THEN - 25 END + CASE - CASE WHEN ( NULL ) <= + + 17 THEN + MAX ( 15 ) ELSE NULL END WHEN - + 65 THEN - 49 ELSE + - 17 END AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_80.slt:40456

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL - 42 * ( COUNT ( * ) ) AS col2, - CASE WHEN NOT + 39 IS NOT NULL THEN + SUM ( DISTINCT + - 1 ) ELSE NULL END * 56 - - + 50 + - 53 * + - 36 AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_81.slt:12321

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - CASE - + CASE WHEN ( ( - - 17 ) IS NOT NULL ) THEN AVG ( ALL 79 ) ELSE NULL END WHEN 49 THEN 39 * 35 ELSE NULL END
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_95.slt:33226

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + + ( - + 22 ) / + CASE WHEN + 73 IS NOT NULL THEN + ( MAX ( 78 ) ) END AS col1
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_105.slt:11744

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - - CASE WHEN NOT NULL = NULL THEN MIN ( DISTINCT - ( - 23 ) ) END + 98 - + NULLIF ( 71, + 12 + 81 * 1 ) - 90 / + ( + SUM ( - COALESCE ( 54, - ( 56 ) ) ) * 52 )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_68.slt:42250

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL CAST ( NULL AS INTEGER ) + + CASE WHEN NOT 22 >= 79 + + 76 THEN MIN ( - 62 ) END * - NULLIF ( 59, + AVG ( ALL 10 ) ) AS col2
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_41.slt:4264

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + CASE WHEN NOT ( NULL ) IS NOT NULL THEN + MAX ( DISTINCT + + 99 ) END AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_42.slt:51666

External error: query is expected to fail, but actually succeed:
[SQL] SELECT DISTINCT ( + CASE WHEN NULL = NULL THEN SUM ( ALL - 99 ) END )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_82.slt:30305

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - COALESCE ( 99, + 49 + - 56 * - 40 - CAST ( + + CASE - 8 WHEN + CASE 80 WHEN + + ( + 82 ) * 7 THEN NULL WHEN ( - 36 ) * 89 THEN + 14 END THEN NULL WHEN CASE WHEN NULL IS NULL THEN + COUNT ( * ) END * - 91 THEN ( 35 ) + - CAST ( - SUM ( 62 ) AS INTEGER ) / 33 WHEN + COUNT ( * ) * - 13 + + 32 * 93 THEN + 89 + 8 END AS INTEGER ) * + 92 ) * SUM ( + 45 * CAST ( NULL AS INTEGER ) ) - ( + 57 )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_104.slt:33182

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL CASE - COUNT ( * ) WHEN 98 THEN - + ( + COUNT ( * ) ) - + 28 / - 3 WHEN - CASE WHEN NOT ( NULL ) > NULL THEN - COUNT ( * ) END + MIN ( ALL 78 ) THEN 78 * COALESCE ( + 15, - 42, SUM ( - 38 ) * - CAST ( - 89 AS INTEGER ), 65 + + 73 ) END / 50 - ( + 85 + 58 / + 15 )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_111.slt:50423

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + + 39 * + - COALESCE ( CASE WHEN NOT ( NULL ) <= NULL THEN + ( SUM ( 88 ) ) END, + 53 ) * - 97 * - ( - - 13 ) * 11 AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_24.slt:23423

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + + 39 * + - COALESCE ( CASE WHEN NOT ( NULL ) <= NULL THEN + ( SUM ( 88 ) ) END, + 53 ) * - 97 * - ( - - 13 ) * 11 AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_30.slt:28635

External error: query is expected to fail, but actually succeed:
[SQL] SELECT DISTINCT - + 67 * + CASE + COALESCE ( + 29, 44, - - 69 ) WHEN + + COUNT ( * ) + - 7 THEN 37 * - 34 * + - ( + 7 ) * - CAST ( + - 81 AS INTEGER ) + + 66 + SUM ( + 27 ) * - - 89 + 45 + + ( - CASE - - 69 WHEN + - 82 THEN + - 33 / 73 * + 2 WHEN 59 THEN - ( + + 75 ) END ) + 9 + + 47 - + CASE - 76 WHEN NULLIF ( + - NULLIF ( + 21, - MAX ( NULLIF ( + 80, - 47 ) ) + CAST ( NULL AS INTEGER ) ), - 62 ) THEN + 25 WHEN 81 + 59 THEN + 67 + - 3 * - 59 END / + 66 / + 52 ELSE NULL END / - NULLIF ( CASE WHEN NULL IS NULL THEN + SUM ( DISTINCT 30 ) ELSE NULL END * - 17, CAST ( NULL AS INTEGER ) )
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_21.slt:1317

External error: query is expected to fail, but actually succeed:
[SQL] SELECT col0 + - CASE WHEN 64 IS NULL THEN + col0 END FROM tab2 AS cor0 GROUP BY col0
at ../../datafusion-testing/data/sqlite/random/groupby/slt_good_12.slt:2998

External error: query is expected to fail, but actually succeed:
[SQL] SELECT + CASE WHEN NOT ( NULL ) IS NOT NULL THEN + MAX ( DISTINCT + + 99 ) END AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_36.slt:20722

External error: query is expected to fail, but actually succeed:
[SQL] SELECT - CASE WHEN NOT ( 49 ) IS NULL THEN - - COUNT ( DISTINCT - - 85 ) END AS col0
at ../../datafusion-testing/data/sqlite/random/expr/slt_good_22.slt:28006

External error: query is expected to fail, but actually succeed:
[SQL] SELECT ALL - col2 AS col2 FROM tab1 WHERE NOT NULL <> ( 78 + CASE WHEN NOT ( NULL ) IS NULL THEN + col0 ELSE NULL END ) GROUP BY col1, col2
at ../../datafusion-testing/data/sqlite/random/groupby/slt_good_11.slt:57644

Error: Execution("44 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`

}

#[test]
fn case_with_scalar_predicate() -> Result<()> {

I personally recommend adding a .slt test for this as well (or possibly instead of the unit test):

--- a/datafusion/sqllogictest/test_files/case.slt
+++ b/datafusion/sqllogictest/test_files/case.slt
@@ -235,3 +235,10 @@ SELECT CASE WHEN a < 5 THEN a + b ELSE b - NVL(a, 0) END FROM foo
 NULL
 NULL
 7
+
+# Reproducer for
+# https://github.com/apache/datafusion/issues/14099
+query I
+SELECT - 79 * + 91 * - COUNT ( * ) * + - 2 * + - NULLIF ( - 49, - COALESCE ( - + 69, - COALESCE ( + COALESCE ( - 20, ( - 18 ) * + COUNT ( * ) + - 93, - CASE 51 WHEN + COUNT ( * ) + 28 THEN 0 ELSE + 29 * + CASE ( 50 ) WHEN - ( - ( CASE WHEN NOT + 37 IS NULL THEN + COUNT ( * ) END ) ) THEN NULL WHEN - 46 + 87 * - 28 THEN 85 WHEN - COUNT ( * ) THEN NULL END END ), COUNT ( * ) - 39 ) * + 22 ) / - COUNT ( * ) )
+----
+-704522

if let ColumnarValue::Array(bit_mask) = when_expr.evaluate(batch)? {

let when_expr_value = when_expr.evaluate(batch)?;
let when_expr_value = match when_expr_value {

This is correct, but it effectively expands the predicate out into a full array even when the argument is a constant. We can probably make it more efficient by handling the scalar case directly too.
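To illustrate the point, here is a minimal std-only sketch of the two options; it is not the code in this PR. `Predicate` is a hypothetical stand-in for `ColumnarValue`, plain `Vec<Option<_>>` stands in for Arrow arrays, and the function names are made up for the example. It assumes a `CASE WHEN pred THEN value END` with no `ELSE`, where a false or NULL predicate yields NULL.

```rust
/// Hypothetical, simplified stand-in for a columnar predicate result.
#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    /// One nullable boolean per row.
    Array(Vec<Option<bool>>),
    /// A single nullable boolean that applies to every row.
    Scalar(Option<bool>),
}

/// Keep `value` where the mask is exactly `true`; otherwise produce NULL.
fn apply_mask(mask: &[Option<bool>], then_values: &[Option<i64>]) -> Vec<Option<i64>> {
    mask.iter()
        .zip(then_values)
        .map(|(m, v)| if *m == Some(true) { *v } else { None })
        .collect()
}

/// Straightforward version: a scalar predicate is first expanded into a
/// per-row mask, which allocates a full array even for a constant.
fn case_when_or_null_expanded(pred: &Predicate, then_values: &[Option<i64>]) -> Vec<Option<i64>> {
    let mask: Vec<Option<bool>> = match pred {
        Predicate::Array(mask) => mask.clone(),
        Predicate::Scalar(s) => vec![*s; then_values.len()],
    };
    apply_mask(&mask, then_values)
}

/// Version with a scalar fast path: a constant predicate decides the whole
/// batch at once, so no intermediate mask is built.
fn case_when_or_null(pred: &Predicate, then_values: &[Option<i64>]) -> Vec<Option<i64>> {
    match pred {
        // Constant TRUE: every row takes the THEN value.
        Predicate::Scalar(Some(true)) => then_values.to_vec(),
        // Constant FALSE or NULL: every row is NULL.
        Predicate::Scalar(_) => vec![None; then_values.len()],
        // Per-row predicate: fall back to masking row by row.
        Predicate::Array(mask) => apply_mask(mask, then_values),
    }
}
```

Both functions should return the same result for any input; the second only avoids materializing the mask when the predicate is a constant, which is the shape of the failing queries above (for example `CASE WHEN NULL > NULL THEN ... END`).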

@alamb

alamb commented Jan 16, 2025

I feel like we could commit this PR and update the sqllogictest expected result in a different PR or we could also do it as part of this PR. @Omega359 any preference?

@Omega359

Omega359 commented Jan 16, 2025

That's a good question @alamb. My concern with doing it in a separate PR is that other PR merges to main will fail until the tests are updated. On the flip side I'm the only one currently that can automatically regenerate the slt files so someone doesn't have to manually fix a few hundred or few thousand test cases (I have hacked up code to do it with a hacked side-branch of the sqllogictest-rs repo to do the generation).

For now I think the best approach is for me to clone this branch, run the regeneration code, push a PR for the datafusion-testing repo, then once that is merged this PR be updated to include the latest hash for datafusion-testing. I'll try and do the regeneration and PR tonight but that may or may not be possible depending on how hard I race my bike tonight :)

Longer term I think we'll need a better solution for this.

@alamb

alamb commented Jan 16, 2025

> Longer term I think we'll need a better solution for this.

Agreed - I did what I do best and filed a ticket to track the work!

@alamb

alamb commented Jan 16, 2025

I also think we can improve the code a bit more, so I made a follow-on PR (I will mark it as ready for review after this one merges)

@Omega359

Status update: I've rerun the slt generation locally. I just need to apply patches for 2 failures, look over the updates, and then push a PR. Hopefully by EOD today.

@alamb

alamb commented Jan 17, 2025

Thank you @Omega359

@Omega359

apache/datafusion-testing#5. Once that is merged @avkirilishin you will need to update the datafusion-testing submodule in your branch to reflect the latest change.

git submodule update --remote --merge
git add datafusion-testing 
git commit -m "Your message here" 
git push

I think that should do it.

@alamb

alamb commented Jan 18, 2025

I took the liberty of merging this PR up from main and updating the datafusion-testing pin per @Omega359's suggestion in #14156 (comment)

I then ran the tests:

INCLUDE_SQLITE=true nice cargo test --profile release-nonlto --test sqllogictests

And I am happy to report that it passes successfully:

    Finished `release-nonlto` profile [optimized] target(s) in 0.17s
     Running bin/sqllogictests.rs (target/release-nonlto/deps/sqllogictests-9840fb00d6768257)
Completed in 6 minutes
(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion2$

@alamb alamb merged commit 868fc35 into apache:main Jan 18, 2025
25 checks passed
@alamb

alamb commented Jan 18, 2025

Thanks again @avkirilishin and @Omega359

@alamb

alamb commented Jan 18, 2025

My follow-on PR is ready for review now too:

avantgardnerio pushed a commit to coralogix/arrow-datafusion that referenced this pull request Oct 1, 2025
avantgardnerio pushed a commit to coralogix/arrow-datafusion that referenced this pull request Oct 15, 2025
Dandandan added a commit to coralogix/arrow-datafusion that referenced this pull request Nov 17, 2025
* Get working build

* Add pool_size method to MemoryPool (#218) (#230)

* Add pool_size method to MemoryPool

* Fix

* Fmt

Co-authored-by: Daniël Heres <danielheres@gmail.com>

* Respect `IGNORE NULLS` flag in `ARRAY_AGG` (#260) (apache#15544) v48

* Hook for doing distributed `CollectLeft` joins (#269)

* Ignore writer shutdown error (#271)

* ignore writer shutdown error

* cargo check

* Fix  bug in `swap_hash_join` (#278)

* Try and fix swap_hash_join

* Only swap projections when join does not have projections

* just backport upstream fix

* remove println

* Support Duration in min/max agg functions (#283) (apache#15310) v47

* Support Duration in min/max agg functions

* Attempt to fix build

* Attempt to fix build - Fix chrono version

* Revert "Attempt to fix build - Fix chrono version"

This reverts commit fd76fe6.

* Revert "Attempt to fix build"

This reverts commit 9114b86.

---------

Co-authored-by: svranesevic <svranesevic@users.noreply.github.com>

* Fix panics in array_union (#287) (apache#15149) v48

* Drop rust-toolchain

* Fix panics in array_union

* Fix the chrono

* Backport `GroupsAccumulator` for Duration min/max agg (#288) (apache#15322) v47

* Fix array_sort for empty record batch (#290) (apache#15149) v48

* fix: rewrite fetch, skip of the Limit node in correct order (apache#14496) v46

* fix: rewrite fetch, skip of the Limit node in correct order

* style: fix clippy

* Support aliases in ConstEvaluator (apache#14734) (#281) v46

* Support aliases in ConstEvaluator (apache#14734)

Not sure why they are not supported. It seems that if we're not careful,
some transformations can introduce aliases nested inside other expressions.

* Format Cargo.toml

* Preserve the name of grouping sets in SimplifyExpressions (#282) (apache#14888) v46

Whenever we use `recompute_schema` or `with_exprs_and_inputs`,
this ensures that we obtain the same schema.

* Support Duration in min/max agg functions (#284) (apache#15310) v47

Co-authored-by: svranesevic <svranesevic@users.noreply.github.com>

* fix case_column_or_null with nullable when conditions (apache#13886) v45

* fix case_column_or_null with nullable when conditions

* improve sqllogictests for case_column_or_null

---------

Co-authored-by: zhangli20 <zhangli20@kuaishou.com>

* Fix casewhen (apache#14156) v45

* Cherry-pick topk limit pushdown fix (apache#14192) v45

* fix: FULL OUTER JOIN and LIMIT produces wrong results (apache#14338) v45

* fix: FULL OUTER JOIN and LIMIT produces wrong results

* Fix minor slt testing

* fix test

(cherry picked from commit ecc5694)

* Cherry-pick global limit fix (apache#14245) v45

* fix: Limits are not applied correctly (apache#14418) v46

* fix: Limits are not applied correctly

* Add easy fix

* Add fix

* Add slt testing

* Address comments

* Disable grouping set in CSE

* Fix spm + limit (apache#14569) v46

* prost 0.13 / fix parquet dep

* Delete unreliable checks

* Segfault in ByteGroupValueBuilder (#294) (apache#15968) v50

* test to demonstrate segfault in ByteGroupValueBuilder

* check for offset overflow

* clippy

(cherry picked from commit 5bdaeaf)

* Update arrow dependency to include rowid (#295)

* Update arrow version

* Feat: Add fetch to CoalescePartitionsExec (apache#14499) (#298) v46

* add fetch info to CoalescePartitionsExec

* use Statistics with_fetch API on CoalescePartitionsExec

* check limit_reached only if fetch is assigned

Co-authored-by: mertak-synnada <mertak67+synaada@gmail.com>

* Fix `CoalescePartitionsExec` proto serialization (apache#15824) (#299) v48

* add fetch to CoalescePartitionsExecNode

* gen proto code

* Add test

* fix

* fix build

* Fix test build

* remove comments

Co-authored-by: 张林伟 <lewiszlw520@gmail.com>

* Add JoinContext with JoinLeftData to TaskContext in HashJoinExec (#300)

* Add JoinContext with JoinLeftData to TaskContext in HashJoinExec

* Expose random state as const

* re-export ahash::RandomState

* JoinContext default impl

* Add debug log when setting join left data

* Update arrow version for not preserving dict_id (#303)

* Use partial aggregation schema for spilling to avoid column mismatch in GroupedHashAggregateStream (apache#13995) (#302) v45

* Refactor spill handling in GroupedHashAggregateStream to use partial aggregate schema

* Implement aggregate functions with spill handling in tests

* Add tests for aggregate functions with and without spill handling

* Move test related imports into mod test

* Rename spill pool test functions for clarity and consistency

* Refactor aggregate function imports to use fully qualified paths

* Remove outdated comments regarding input batch schema for spilling in GroupedHashAggregateStream

* Update aggregate test to use AVG instead of MAX

* assert spill count

* Refactor partial aggregate schema creation to use create_schema function

* Refactor partial aggregation schema creation and remove redundant function

* Remove unused import of Schema from arrow::datatypes in row_hash.rs

* move spill pool testing for aggregate functions to physical-plan/src/aggregates

* Use Arc::clone for schema references in aggregate functions

(cherry picked from commit 81b50c4)

Co-authored-by: kosiew <kosiew@gmail.com>

* Update tag

* Push limits past windows (#337) (apache#17347) v50

* Restore old method for DQE

* feat(optimizer): Enable filter pushdown on window functions (apache#14026) v45

* Avoid Aliased Window Expr Enter Unreachable Code (apache#14109) v45

(cherry picked from commit fda500a)

* Use `Expr::qualified_name()` and `Column::new()` to extract partition keys from window and aggregate operators (#355) (apache#17757) v51

* Update PR template to be relevant to our fork

* Make limit pushdown work for SortPreservingMergeExec (apache#17893) (#361)

* re-publicise functions DQE relies on

* Handle columns in with_new_exprs with a Join (apache#15055) (#384)

apache#15055

* handle columns in with_new_exprs with Join

* test doesn't return result

* take join from result

* clippy

* make test fallible

* accept any pair of expression for new_on in with_new_exprs for Join

* use with_capacity

Co-authored-by: delamarch3 <68732277+delamarch3@users.noreply.github.com>

---------

Co-authored-by: Georgi Krastev <georgi.krastev@coralogix.com>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Co-authored-by: Dan Harris <1327726+thinkharderdev@users.noreply.github.com>
Co-authored-by: Faiaz Sanaulla <105630300+fsdvh@users.noreply.github.com>
Co-authored-by: Sava Vranešević <20240220+svranesevic@users.noreply.github.com>
Co-authored-by: svranesevic <svranesevic@users.noreply.github.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: Zhang Li <richselian@gmail.com>
Co-authored-by: zhangli20 <zhangli20@kuaishou.com>
Co-authored-by: Aleksey Kirilishin <54231417+avkirilishin@users.noreply.github.com>
Co-authored-by: xudong.w <wxd963996380@gmail.com>
Co-authored-by: Qi Zhu <821684824@qq.com>
Co-authored-by: Martins Purins <martins.purins@coralogix.com>
Co-authored-by: mertak-synnada <mertak67+synaada@gmail.com>
Co-authored-by: 张林伟 <lewiszlw520@gmail.com>
Co-authored-by: kosiew <kosiew@gmail.com>
Co-authored-by: nuno-faria <nunofpfaria@gmail.com>
Co-authored-by: Berkay Şahin <124376117+berkaysynnada@users.noreply.github.com>
Co-authored-by: Mason Hall <mason.hall@coralogix.com>
Co-authored-by: delamarch3 <68732277+delamarch3@users.noreply.github.com>

Labels

physical-expr Changes to the physical-expr crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sqlite test query results in Internal error: predicate did not evaluate to an array

3 participants