Migrate datafusion/sql tests to insta, part6 #15578

qstommyshu · 2025-04-04T14:08:35Z

Which issue does this PR close?

Related Migrate datafusion/sql tests to insta #15397
Related Migrate datafusion/sql tests to insta,sql integrations #15484, Migrate datafusion/sql tests to insta, part1 #15497, Migrate datafusion/sql tests to insta, part2 #15499, Migrate datafusion/sql tests to insta, part3 #15533, Migrate datafusion/sql tests to insta, part4 #15548 , Migrate datafusion/sql tests to insta, part5 #15567
this is a part of Migrate datafusion/sql tests to insta,sql integrations #15484 breaking down.
Checkout things to note of the whole migration in comments section of Migrate datafusion/sql tests to insta,sql integrations #15484.

Rationale for this change

What changes are included in this PR?

This is the part 5 of #15484 breakdown, as the code changes in #15484 is too large. Part1: #15497, Part2: #15499, Part3: #15533, Part4: 15548, Part5: #15567

Are these changes tested?

Yes, I manually tested the before/after changes.

Are there any user-facing changes?

No

…snapshot testing

qstommyshu · 2025-04-04T19:23:51Z

This is the last PR for #15397

The migration is mostly done. There is only one really big test roundtrip_statement_with_dialect() that I think it would be too big to break down into individual test cases with assert_snapshot! inline assertion (I'm thinking if there is a way to to break it down without increasing 1000+ lines of test code).

Please let me know if you have any suggestion @alamb @blaginin .

qstommyshu · 2025-04-04T19:32:26Z

Also, I found the test cases are a bit here and there, but I think the sql test suite are too large to restructure now. I'm thinking maybe we can try to use modules to separate different tests in future?

Here is an example:

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

pub fn multiply(a: i32, b: i32) -> i32 {
    a * b
}

#[cfg(test)]
mod tests {
    use super::*;

    // add test suite
    mod add_tests {
        use super::*;

        #[test]
        fn test_add_positive() {
            assert_eq!(add(2, 3), 5);
        }

        #[test]
        fn test_add_negative() {
            assert_eq!(add(-2, -3), -5);
        }
    }

    // multiply test suite
    mod multiply_tests {
        use super::*;

        #[test]
        fn test_multiply_positive() {
            assert_eq!(multiply(4, 5), 20);
        }

        #[test]
        fn test_multiply_mixed() {
            assert_eq!(multiply(-3, 6), -18);
        }
    }

    // integration test suite
    mod integration_tests {
        use super::*;

        #[test]
        fn test_combined_operations() {
            let sum = add(2, 3);
            let product = multiply(sum, 4);
            assert_eq!(product, 20);
        }
    }
}

…ot assertions with `insta`

…snapshot assertions

qstommyshu · 2025-04-06T12:26:08Z

Hi @alamb and @blaginin

I found a way to migrate roundtrip_statement_with_dialect() now, just want to confirm if you like to proceed with it.

What I did below is to incorporate all core test logics into a macro, and try to minimize the lines of code (so that we don't have too much lines of code increase in this migration) for each test case but still expressive, by adding the variable name before each variable in the macro (because rust analyzer doesn't indicate types for macro).

#[macro_export]
macro_rules! roundtrip_statement_with_dialect_helper {
    (
        query: $sql:expr,
        parser_dialect: $parser_dialect:expr,
        unparser_dialect: $unparser_dialect:expr,
        expected: @ $expected:literal $(,)?
    ) => {{
        let statement = Parser::new(&$parser_dialect)
            .try_with_sql($sql)?
            .parse_statement()?;

        let state = MockSessionState::default()
            .with_aggregate_function(max_udaf())
            .with_aggregate_function(min_udaf())
            .with_expr_planner(Arc::new(CoreFunctionPlanner::default()))
            .with_expr_planner(Arc::new(NestedFunctionPlanner));

        let context = MockContextProvider { state };
        let sql_to_rel = SqlToRel::new(&context);
        let plan = sql_to_rel
            .sql_statement_to_plan(statement)
            .unwrap_or_else(|e| panic!("Failed to parse sql: {}\n{e}", $sql));

        let unparser = Unparser::new(&$unparser_dialect);
        let roundtrip_statement = unparser.plan_to_sql(&plan)?;

        let actual = &roundtrip_statement.to_string();
        insta::assert_snapshot!(actual, @ $expected);
    }};
}

// each test case would look like this
#[test]
fn roundtrip_statement_with_dialect_1() -> Result<(), DataFusionError> {
    roundtrip_statement_with_dialect_helper!(
        query: "select min(ta.j1_id) as j1_min from j1 ta order by min(ta.j1_id) limit 10;",
        parser_dialect: MySqlDialect {},
        unparser_dialect: UnparserMySqlDialect {},
        expected: @"SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10",
    );
    Ok(())
}

Note: we can't put this macro in a for loop because inline assert_snapshot!($actual, @ $expected) only takes literal as expected argument, so we can't store the expected value in a variable and then pass to the macro (expr != literal).

And I feel like this is not a standard way to write test, not sure if I should go head and do it or not. Please let me know if you want to proceed with this way.

alamb

This looks great to me -- thank you @qstommyshu

I had some minor suggestions but nothing required in my opinion

cc @blaginin

datafusion/sql/tests/cases/diagnostic.rs

datafusion/sql/tests/cases/plan_to_sql.rs

qstommyshu · 2025-04-06T14:02:49Z

Hi @alamb and @blaginin

I found a way to migrate roundtrip_statement_with_dialect() now, just want to confirm if you like to proceed with it.

What I did below is to incorporate all core test logics into a macro, and try to minimize the lines of code (so that we don't have too much lines of code increase in this migration) for each test case but still expressive, by adding the variable name before each variable in the macro (because rust analyzer doesn't indicate types for macro).

#[macro_export]
macro_rules! roundtrip_statement_with_dialect_helper {
    (
        query: $sql:expr,
        parser_dialect: $parser_dialect:expr,
        unparser_dialect: $unparser_dialect:expr,
        expected: @ $expected:literal $(,)?
    ) => {{
        let statement = Parser::new(&$parser_dialect)
            .try_with_sql($sql)?
            .parse_statement()?;

        let state = MockSessionState::default()
            .with_aggregate_function(max_udaf())
            .with_aggregate_function(min_udaf())
            .with_expr_planner(Arc::new(CoreFunctionPlanner::default()))
            .with_expr_planner(Arc::new(NestedFunctionPlanner));

        let context = MockContextProvider { state };
        let sql_to_rel = SqlToRel::new(&context);
        let plan = sql_to_rel
            .sql_statement_to_plan(statement)
            .unwrap_or_else(|e| panic!("Failed to parse sql: {}\n{e}", $sql));

        let unparser = Unparser::new(&$unparser_dialect);
        let roundtrip_statement = unparser.plan_to_sql(&plan)?;

        let actual = &roundtrip_statement.to_string();
        insta::assert_snapshot!(actual, @ $expected);
    }};
}

// each test case would look like this
#[test]
fn roundtrip_statement_with_dialect_1() -> Result<(), DataFusionError> {
    roundtrip_statement_with_dialect_helper!(
        query: "select min(ta.j1_id) as j1_min from j1 ta order by min(ta.j1_id) limit 10;",
        parser_dialect: MySqlDialect {},
        unparser_dialect: UnparserMySqlDialect {},
        expected: @"SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10",
    );
    Ok(())
}

Note: we can't put this macro in a for loop because inline assert_snapshot!($actual, @ $expected) only takes literal as expected argument, so we can't store the expected value in a variable and then pass to the macro (expr != literal).

And I feel like this is not a standard way to write test, not sure if I should go head and do it or not. Please let me know if you want to proceed with this way.

Just want to confirm, do I need to migrate the big test case roundtrip_statement_with_dialect() ?

alamb · 2025-04-06T19:08:03Z

Just want to confirm, do I need to migrate the big test case roundtrip_statement_with_dialect() ?

IN my opinion, it is is note required. If you want to rewrite it I am sure it would be appreciated by future contributors

Another pattern is to put the entire test into a structure that has a Display impl and then print out that display impl in the snapshot. I did something like this in pydantic#16

So instead of

// each test case would look like this
#[test]
fn roundtrip_statement_with_dialect_1() -> Result<(), DataFusionError> {
    roundtrip_statement_with_dialect_helper!(
        query: "select min(ta.j1_id) as j1_min from j1 ta order by min(ta.j1_id) limit 10;",
        parser_dialect: MySqlDialect {},
        unparser_dialect: UnparserMySqlDialect {},
        expected: @"SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10",
    );
    Ok(())
}

struct TestResult { 
  input_query: String,
  parser_dialect: &Dialect,
  unparser_dialect: &Dialect,
  output_query: String 
}

impl TestResult {
  fn try_new(
  input_query: String,
  parser_dialect: &Dialect,
  unparser_dialect: &Dialect,
  ) -> Result<Self> { 
   // do unparse , etc
  }
}

impl Display for TestResult { ... }

// tests make a TestResult and compare the string result
#[test]
fn roundtrip_statement_with_dialect_1() -> Result<(), DataFusionError> {
    let result =


    insta::assert_snapshot!(
        TestResult::try_new(
          select min(ta.j1_id) as j1_min from j1 ta order by min(ta.j1_id) limit 10;",
          MySqlDialect {},
           UnparserMySqlDialect {},
        },
        @r"
        query: "select min(ta.j1_id) as j1_min from j1 ta order by min(ta.j1_id) limit 10;",
        parser_dialect: MySqlDialect {},
        unparser_dialect: UnparserMySqlDialect {},
        output "SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10",
         );
}
}

qstommyshu · 2025-04-07T02:01:03Z

Got it, Thanks @alamb for your thoughts.

I will update roundtrip_statement_with_dialect(). I will probably go with the macro approach because the macro approach is essentially the same as the original test, so I can almost follow the original TestStatementWithDialect structure. Plus, I don't need to create a new struct and impl traits.

I can foresee the code changes will be large, so I will leave it to the next PR (part 7, should be the real last part). Please merge this PR if you think the change is good.

alamb · 2025-04-07T11:03:00Z

Thanks @qstommyshu !

* WIP * migrate tests in `diagnostic.rs` to `insta` * Migrate simple test case in `plan_to_sql.rs` to `insta` * Migrate `sql_round_trip` test cases in `plan_to_sql.rs` to `insta` * Migrate table reference tests in `plan_to_sql.rs` to use `insta` for snapshot testing * Refactor roundtrip expression tests in `plan_to_sql.rs` to use snapshot assertions with `insta` * Update roundtrip expression tests in `plan_to_sql.rs` to use updated snapshot assertions * resolve comments

WIP

69febef

github-actions bot added the sql SQL Planner label Apr 4, 2025

qstommyshu added 4 commits April 4, 2025 11:08

migrate tests in diagnostic.rs to insta

b403c33

Migrate simple test case in plan_to_sql.rs to insta

223b300

Migrate sql_round_trip test cases in plan_to_sql.rs to insta

90bffea

Migrate table reference tests in plan_to_sql.rs to use insta for …

a70351a

…snapshot testing

qstommyshu marked this pull request as ready for review April 4, 2025 19:18

qstommyshu added 2 commits April 4, 2025 21:34

Refactor roundtrip expression tests in plan_to_sql.rs to use snapsh…

59ab33a

…ot assertions with `insta`

Update roundtrip expression tests in plan_to_sql.rs to use updated …

3dcce77

…snapshot assertions

alamb approved these changes Apr 6, 2025

View reviewed changes

datafusion/sql/tests/cases/diagnostic.rs Show resolved Hide resolved

datafusion/sql/tests/cases/plan_to_sql.rs Show resolved Hide resolved

datafusion/sql/tests/cases/plan_to_sql.rs Show resolved Hide resolved

datafusion/sql/tests/cases/plan_to_sql.rs Show resolved Hide resolved

resolve comments

2d93654

alamb merged commit 4ccd455 into apache:main Apr 7, 2025
28 checks passed

qstommyshu mentioned this pull request Apr 7, 2025

Migrate datafusion/sql tests to insta, part7 #15621

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Migrate datafusion/sql tests to insta, part6 #15578

Migrate datafusion/sql tests to insta, part6 #15578

Uh oh!

qstommyshu commented Apr 4, 2025 •

edited

Loading

Uh oh!

qstommyshu commented Apr 4, 2025 •

edited

Loading

Uh oh!

qstommyshu commented Apr 4, 2025 •

edited

Loading

Uh oh!

qstommyshu commented Apr 6, 2025 •

edited

Loading

Uh oh!

alamb left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

qstommyshu commented Apr 6, 2025

Uh oh!

alamb commented Apr 6, 2025

Uh oh!

qstommyshu commented Apr 7, 2025

Uh oh!

Uh oh!

alamb commented Apr 7, 2025

Uh oh!

Uh oh!

Migrate datafusion/sql tests to insta, part6 #15578

Migrate datafusion/sql tests to insta, part6 #15578

Uh oh!

Conversation

qstommyshu commented Apr 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Uh oh!

qstommyshu commented Apr 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qstommyshu commented Apr 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qstommyshu commented Apr 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alamb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

qstommyshu commented Apr 6, 2025

Uh oh!

alamb commented Apr 6, 2025

Uh oh!

qstommyshu commented Apr 7, 2025

Uh oh!

Uh oh!

alamb commented Apr 7, 2025

Uh oh!

Uh oh!

qstommyshu commented Apr 4, 2025 •

edited

Loading

qstommyshu commented Apr 4, 2025 •

edited

Loading

qstommyshu commented Apr 4, 2025 •

edited

Loading

qstommyshu commented Apr 6, 2025 •

edited

Loading