Skip to content

Use TableReference for TableScan#5615

Merged
alamb merged 8 commits intoapache:mainfrom
alamb:alamb/owned_table_reference
Mar 16, 2023
Merged

Use TableReference for TableScan#5615
alamb merged 8 commits intoapache:mainfrom
alamb:alamb/owned_table_reference

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Mar 15, 2023

Which issue does this PR close?

Closes #5522

Rationale for this change

After #5343 some APIs take a String and parse it like a table reference (aka apply sql normalization rules to it)

What changes are included in this PR?

TBD

Are these changes tested?

Yes, both existing tests as well as new unit tests

Are there any user-facing changes?

Theoretically yes, but he existing LogicalPlanBuilder::scan("tablename", ...) will continue to work so no major API work will be needed

@alamb alamb added the api change Changes the API exposed to users of the crate label Mar 15, 2023
@github-actions github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules sql SQL Planner substrait Changes to the substrait crate labels Mar 15, 2023
@alamb alamb force-pushed the alamb/owned_table_reference branch from d213758 to a9dfd29 Compare March 15, 2023 18:14
@alamb alamb marked this pull request as ready for review March 15, 2023 18:59
@alamb
Copy link
Contributor Author

alamb commented Mar 15, 2023

cc @Jefffrey I wonder if you have time to review this PR?

Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this, left a few nitpicks/comments

let plan = LogicalPlanBuilder::from(sub_plan.clone())
.project(projection_exprs)?
.alias(table_name)?;
.alias(table_name.to_string())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will subquery_alias need similar treatment to have an OwnedTableReference instead of String? or String itself should be just fine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is actually important that the subquery alias is a non parsed string here (otherwise tests fail).

This rewrite is effectively inlining the view defition as a named subquery (that doesn't make sense to be in a schema):

...
FROM <view definition> as "table_name"

I will add a comment here.

Comment on lines +669 to +670
pub fn from_qualified<'a>(
qualifier: impl Into<TableReference<'a>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im curious about why explicit lifetime is needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically because the compiler told me to do so 😆

But seriously, I think the reason is that impl doesn't support anonymous lifetimes -- I would have to change this to use the explicit generic syntax

Here is what happens if I remove <'a>:

   --> datafusion/common/src/dfschema.rs:113:45
    |
113 |         qualifier: impl Into<TableReference<'_>>,
    |                                             ^^ expected named lifetime parameter
    |
help: consider introducing a named lifetime parameter
    |
112 ~     pub fn try_from_qualified_schema<'a>(
113 ~         qualifier: impl Into<TableReference<'a>>,
    |

}

fn test_plan() -> LogicalPlan {
let table_ref: Option<TableReference<'_>> = None;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use TableReference::none() that you introduced?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent call. Done in 2c3b6d7

let table_source = table_source(table_schema);
let name = name
.map(|n| n.into())
.unwrap_or_else(|| OwnedTableReference::bare(UNNAMED_TABLE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe TableReference::bare(UNNAMED_TABLE) instead since next line is to_owned_reference() anyway?

Copy link
Contributor Author

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Jefffrey -- the review was really helpful.

Comment on lines +669 to +670
pub fn from_qualified<'a>(
qualifier: impl Into<TableReference<'a>>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically because the compiler told me to do so 😆

But seriously, I think the reason is that impl doesn't support anonymous lifetimes -- I would have to change this to use the explicit generic syntax

Here is what happens if I remove <'a>:

   --> datafusion/common/src/dfschema.rs:113:45
    |
113 |         qualifier: impl Into<TableReference<'_>>,
    |                                             ^^ expected named lifetime parameter
    |
help: consider introducing a named lifetime parameter
    |
112 ~     pub fn try_from_qualified_schema<'a>(
113 ~         qualifier: impl Into<TableReference<'a>>,
    |

}

fn test_plan() -> LogicalPlan {
let table_ref: Option<TableReference<'_>> = None;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent call. Done in 2c3b6d7

let plan = LogicalPlanBuilder::from(sub_plan.clone())
.project(projection_exprs)?
.alias(table_name)?;
.alias(table_name.to_string())?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is actually important that the subquery alias is a non parsed string here (otherwise tests fail).

This rewrite is effectively inlining the view defition as a named subquery (that doesn't make sense to be in a schema):

...
FROM <view definition> as "table_name"

I will add a comment here.

Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks

@alamb
Copy link
Contributor Author

alamb commented Mar 16, 2023

Thanks again for the review @Jefffrey -- since this is blocking me downstream (in IOx, where we are basically playing the role of early adopters / testers) I plan to merge this PR in later today.

@alamb alamb merged commit 4afd67a into apache:main Mar 16, 2023
@alamb alamb deleted the alamb/owned_table_reference branch March 16, 2023 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api change Changes the API exposed to users of the crate core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules sql SQL Planner substrait Changes to the substrait crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Change logical plan TableScan to use OwnedTableReference for table_name

2 participants