Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add explain TableFunction for substrait #114

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

drin
Copy link
Member

@drin drin commented Oct 29, 2024

This adds a new table function, explain_substrait, as well as a new execution kernel for the function and binding kernel. This reuses the FromSubstraitFunctionData structure since it contains all the pieces we would need.

The reason this is useful is because explain naively returns the table function itself instead of the plan it may execute. In the case of executing substrait (e.g. from_substrait(...)), this would be a single operator for the from_substrait function itself. If, instead one calls explain_substrait(...), the result would be an explain plan for what the substrait extension translates into duckdb (the result of SubstraitPlanToDuckDBRel)

@drin
Copy link
Member Author

drin commented Oct 29, 2024

I realized this functionality would be generally useful. For now, this is mostly a placeholder until I can maybe test this at the end of the week and write some tests, etc.

@pdet
Copy link
Contributor

pdet commented Oct 30, 2024

Hey Aldrin, thanks for the PR, I think it looks good!

Could you just add a few tests?

@jacques-n
Copy link
Contributor

@drin , can you rebase and add some tests?

Thanks

This adds a new table function, `explain_substrait`, as well as a new
execution kernel for the function and binding kernel. This reuses the
`FromSubstraitFunctionData` structure since it contains all the pieces
we would need.

In a rebase, we re-add `FromSubstraitFunctionData` and we allow
ourselves to rename some functions since others we were matching the
style of have been removed. Changes upstream changed TableFunctions to
return a TableRef instead of a QueryResult, but to return an explain
plan I believe we still need to return a QueryResult (unless we
construct and return an `ExplainRelation` and call `GetTableRef` on
that).
I had to re-add FromSubstraitFunctionData, so in the process I forgot to
return unique_ptr<FunctionData> instead.

Also re-added missing closing brace.

This also adds a test, but I haven't been able to actually test it.
Additionally, it seems that calling explain is truncating the result, so
I need to figure out how to prevent that from happening in order to
create a proper expected output (and to make the function itself
useful).
@drin
Copy link
Member Author

drin commented Nov 14, 2024

99752e4 is the rebase, and 5707b53 fixes a few issues and adds a test.

Unfortunately, I am developing with protobuf v28.3 and modified my duckdb to link against mohair-substrait: so, it is difficult for me to actually test the changes.

at the moment, I'm getting the following error when trying to run against a mostly accurate version of this PR:

 symbol not found in flat namespace '__ZN6google8protobuf8internal26fixed_address_empty_stringE'

I can continue trying to tie this off in a few days once I reach a good checkpoint with other work.

@drin
Copy link
Member Author

drin commented Nov 14, 2024

@pdet I was wondering if ExplainRelation::GetTableRef would return a TableRef that would eventually resolve to an equivalent QueryResult as Relation::Explain does?

In other words, I am currently returning a QueryResult via substrait_extension.cpp#L312 and using a normal TableFunction (uses a normal bind function). I'm wondering if I could construct an ExplainRelation similarly to how the root ProjectionRelation is constructed in TransformRootOp, and then I can call GetTableRef on that as you do for the other substrait functions (e.g. from_substrait).

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants