Skip to content

Conversation

@qstommyshu
Copy link
Contributor

@qstommyshu qstommyshu commented Mar 26, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Migrated tests in datafusion/core/tests/subtrait to use insta.

Are these changes tested?

Yes, I manually tested the before/after changes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the substrait Changes to the substrait crate label Mar 26, 2025
@qstommyshu qstommyshu changed the title add cargo insta to dev dependencies Migrate subtrait tests to insta Mar 26, 2025
@qstommyshu
Copy link
Contributor Author

qstommyshu commented Mar 26, 2025

Just to clarify this PR is NOT FINISHED YET. I need more clarification of the scope of this issue.

Update:
I think I know what files to change now.

@qstommyshu
Copy link
Contributor Author

qstommyshu commented Mar 27, 2025

Hi @alamb and @blaginin ,

This PR is mostly done. There are still some areas (non-trivial test cases migration) where I'm not sure if I should or how should I migrate them to insta. For example:

The assert_expected_plan in roundtrip_logical_plan.rs accepts a hard-coded expected_plan_str, and uses assert_eq! internally. I wanted to replace the assert_eq! with assert_snapshot!, but Rust function does not accept insta inline snapshot syntax like @r, meaning that I can't pass a raw string to assert_expected_plan to perform a inline snapshot assertion.

And there are many more functions like this, I can't simply change them as they accepts dynamically generated plan_str and plan.schema(). Changing them into assert_snapshot! with inline snapshot may change their original behaviour.

Just want to know if I need to migrate those non-trivial test cases as well, thanks.

@alamb
Copy link
Contributor

alamb commented Mar 27, 2025

And there are many more functions like this, I can't simply change them as they accepts dynamically generated plan_str and plan.schema(). Changing them into assert_snapshot! with inline snapshot may change their original behaviour.

Just want to know if I need to migrate those non-trivial test cases as well, thanks.

In general I suggest we get the basic easy to port tests done and then work on the others as a follow on PR

For those changes, I think you would have to restructure the tests

From

        let expected_plan_str = format!(
            "Projection: count(Int64(1)) AS {syntax}\
        \n  Aggregate: groupBy=[[]], aggr=[[count(Int64(1))]]\
        \n    Projection: \
        \n      LeftSemi Join: data.a = data2.a\
        \n        Aggregate: groupBy=[[data.a]], aggr=[[]]\
        \n          TableScan: data projection=[a]\
        \n        TableScan: data2 projection=[a]"
        );

        assert_expected_plan(
            &format!("SELECT {syntax} FROM (SELECT data.a FROM data INTERSECT SELECT data2.a FROM data2);"),
            &expected_plan_str,
            true
        ).await

Perhaps something more like

         assert_snapshot!(
        do_conversion(&format!("SELECT {syntax} FROM (SELECT data.a FROM data INTERSECT SELECT data2.a FROM data2);),
        @r#"
            "Projection: count(Int64(1)) AS {syntax}\
        \n  Aggregate: groupBy=[[]], aggr=[[count(Int64(1))]]\
        \n    Projection: \
        \n      LeftSemi Join: data.a = data2.a\
        \n        Aggregate: groupBy=[[data.a]], aggr=[[]]\
        \n          TableScan: data projection=[a]\
        \n        TableScan: data2 projection=[a]"
        );
            "#
                );
        Ok(())

Where do_conversion returns a string and does what assert-epected_plan does, something like

async fn do_conversion(
    sql: &str,
    assert_schema: bool,
) -> Result<String> {
    let ctx = create_context().await?;
    let df = ctx.sql(sql).await?;
    let plan = df.into_optimized_plan()?;
    let proto = to_substrait_plan(&plan, &ctx.state())?;
    let plan2 = from_substrait_plan(&ctx.state(), &proto).await?;
    let plan2 = ctx.state().optimize(&plan2)?;

    if assert_schema {
        assert_eq!(plan.schema(), plan2.schema());
    }

    Ok(format!("{plan2}");)
}

@blaginin
Copy link
Collaborator

Closes #15398.

If you plan to do more prs, consider changing it to "related", so the ticket won't get auto closed

@qstommyshu qstommyshu changed the title Migrate subtrait tests to insta Migrate subtrait tests to insta, part1 Mar 28, 2025
@qstommyshu
Copy link
Contributor Author

In general I suggest we get the basic easy to port tests done and then work on the others as a follow on PR

Got it, thanks @alamb and @blaginin for the help.

I think I've finished migrating all the "easy" portion now, please review this PR. The only thing left are the non-trivial cases in roundtrip_logical_plan.rs.

I will create another PR for that as suggested.

@qstommyshu qstommyshu requested a review from blaginin March 28, 2025 02:17
Copy link
Collaborator

@blaginin blaginin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks 🙏

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank ou @qstommyshu and @blaginin

@alamb alamb merged commit cd96b26 into apache:main Mar 28, 2025
29 checks passed
qstommyshu added a commit to qstommyshu/datafusion that referenced this pull request Mar 28, 2025
* add `cargo insta` to dev dependencies

* migrate `consumer_intergration.rs` tests to `insta`

* Revert "migrate `consumer_intergration.rs` tests to `insta`"

This reverts commit c3be2eb.

* migrate `consumer_integration.rs` to `insta` inline snapshot

* migrate logical plans tests to use `insta` snapshots

* migrate emit_kind_tests to use `insta` snapshots

* migrate function_test to use `insta` snapshots for assertions

* migrate substrait_validations tests to use insta snapshots, missing `insta` mapping to `assert!`

* revert `handle_emit_as_project_without_volatile_exprs` back to `assert_eq!` and remove `format!` for `assert_snapshot!`

* migrate function and validation tests to use plan directly in assert_snapshot!

* migrate serialize tests to use insta snapshots for assertions

* migrate logical_plans test to use insta snapshots for assertions
@qstommyshu qstommyshu changed the title Migrate subtrait tests to insta, part1 Migrate subtraits tests to insta, part1 Apr 5, 2025
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
* add `cargo insta` to dev dependencies

* migrate `consumer_intergration.rs` tests to `insta`

* Revert "migrate `consumer_intergration.rs` tests to `insta`"

This reverts commit c3be2eb.

* migrate `consumer_integration.rs` to `insta` inline snapshot

* migrate logical plans tests to use `insta` snapshots

* migrate emit_kind_tests to use `insta` snapshots

* migrate function_test to use `insta` snapshots for assertions

* migrate substrait_validations tests to use insta snapshots, missing `insta` mapping to `assert!`

* revert `handle_emit_as_project_without_volatile_exprs` back to `assert_eq!` and remove `format!` for `assert_snapshot!`

* migrate function and validation tests to use plan directly in assert_snapshot!

* migrate serialize tests to use insta snapshots for assertions

* migrate logical_plans test to use insta snapshots for assertions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

substrait Changes to the substrait crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants