Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support recursive CTEs logical plans in datafusion-proto #13314

Merged
merged 6 commits into from
Nov 12, 2024

Conversation

leoyvens
Copy link
Contributor

@leoyvens leoyvens commented Nov 8, 2024

Rationale for this change

Fixes a gap in support for logical plan serialization to protobuf.

What changes are included in this PR?

Adds support for serializing LogicaPlan::RecursiveQuery to protobuf.

Are these changes tested?

Yes, there is an unit test.

Are there any user-facing changes?

This feature is user visible. Using it effectively is probably tricky due to the need to decode the provider for the CTE work table in the LogicalExtensionCodec. But still, this is a start.

@github-actions github-actions bot added the proto Related to proto crate label Nov 8, 2024
assert_eq!(format!("{plan:?}"), format!("{logical_round_trip:?}"));
assert_eq!(
format!("{}", pretty_format_batches(&output).unwrap()),
format!("{}", pretty_format_batches(&output_round_trip).unwrap())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert_eq looks incorrect. output_round_trip and output are both the result of ctx.sql(query).
I think output_round_trip should be

let output_round_trip = ctx
        .execute_logical_plan(logical_round_trip)
        .await
        .unwrap()
        .collect()
        .await
        .unwrap()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right, and the corrected test was in fact failing! Despite the logical plans being equal, outputs were different.

_node: Arc<dyn TableProvider>,
_buf: &mut Vec<u8>,
) -> Result<(), DataFusionError> {
Ok(())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we need to encode CteWorkTable and then recreate a CteWorkTable when decoding.🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a great idea, I was able to implemented this correctly, without needing LogicalExtensionCodec by following your suggestion.

@leoyvens leoyvens force-pushed the datafusion-proto-recursive-query branch from f647230 to d0401cf Compare November 11, 2024 17:56
@github-actions github-actions bot added the core Core DataFusion crate label Nov 11, 2024
@leoyvens
Copy link
Contributor Author

@jonahgao thank you so much for your review, this PR is in much better shape now.

Copy link
Member

@jonahgao jonahgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM👍, thanks @leoyvens

I merged main to get CI passed and use Arc::clone to avoid conflict with #13338

@jonahgao jonahgao merged commit cd69e37 into apache:main Nov 12, 2024
25 checks passed
@leoyvens leoyvens deleted the datafusion-proto-recursive-query branch November 12, 2024 10:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate proto Related to proto crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants