Skip to content

Conversation

@alamb
Copy link

@alamb alamb commented Mar 6, 2025

Which issue does this PR close?

Rationale for this change

@berkaysynnada and I agree that removing these tests would likely result in a lack of coverage. See https://github.com/apache/datafusion/pull/15012/files#r1982223008

What changes are included in this PR?

ports the tests from unwrap_cast to live along side the code in the simplifier

I purposely kept the commits separate to make the review easier

  • Copy/paste the tests
  • Update the tests to get them to compile (very small change to use Simplifier api)
  • update the tests that needed to be changed (I will call this out inline)

Are these changes tested?

All tests, no code change

Are there any user-facing changes?

schema: Arc::clone(schema),
};
expr.rewrite(&mut expr_rewriter).data().unwrap()
let props = ExecutionProps::new();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original tests looked like the following

    fn optimize_test(expr: Expr, schema: &DFSchemaRef) -> Expr {
        let mut expr_rewriter = UnwrapCastExprRewriter {
            schema: Arc::clone(schema),
        };
        expr.rewrite(&mut expr_rewriter).data().unwrap()
    }

I changed optimize_test to use ExprSimplifier instead and most of the test just pass unchanged

// cast(INT8(NULL), INT32) < INT32(12) => INT8(NULL) < INT8(12) => BOOL(NULL)
let lit_lt_lit = cast(null_i8(), DataType::Int32).lt(lit(12i32));
let expected = null_i8().lt(lit(12i8));
let expected = null_bool();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplifier simplifies this expression to a constant

// arrow_cast('value', 'Dictionary<Int32, Utf8>') = cast(str1 as Dictionary<Int32, Utf8>) => Utf8('value1') = str1
let expr_input = lit(dict.clone()).eq(cast(col("str1"), dict.data_type()));
let expected = lit("value").eq(col("str1"));
let expected = col("str1").eq(lit("value"));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expr simplifier puts the expressions in canonical order so this gets changed

let expr_lt =
cast(col("c1"), DataType::Int64).in_list(vec![lit(12i64), lit(24i64)], false);
let expected = col("c1").in_list(vec![lit(12i32), lit(24i32)], false);
// INT32(C1) IN (INT32(12),INT64(23),INT64(34),INT64(56),INT64(78)) ->
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExprSimplifier rewrites small IN lists to BooleanExpr so I had to increase the size of the lists in these tests to avoid that rewrite

@jayzhan211 jayzhan211 merged commit df9f60c into jayzhan211:mv-unwrap-cast-to-sim-expr Mar 6, 2025
24 of 25 checks passed
@alamb alamb deleted the alamb/port_tests branch March 6, 2025 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants