Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reject CREATE TABLE/VIEW with duplicate column names #13517

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Rename builders' new() to try_new()
  • Loading branch information
findepi committed Nov 28, 2024
commit e0ecbb86d4d4c86e6f7cef5742dc09719df54c9c
18 changes: 9 additions & 9 deletions datafusion/expr/src/logical_plan/ddl.rs
Original file line number Diff line number Diff line change
Expand Up @@ -291,7 +291,7 @@ impl PartialOrd for CreateExternalTable {
}

impl CreateExternalTable {
pub fn new(fields: CreateExternalTableFields) -> Result<Self> {
pub fn try_new(fields: CreateExternalTableFields) -> Result<Self> {
let CreateExternalTableFields {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonahgao I wonder if you have some time to offer an opinion on this pattern (a way to allow pattern matching but still ensure a struct has to be valudated as part of construction

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CreateExternalTableFields makes things too complex. It has exactly the same fields as CreateExternalTable, and users will feel confused and need to spend time distinguishing and using them.

In my opinion, it is enough to perform all the validity checks in CreateExternalTableBuilder::build, such as missing required fields and duplicate column names. CreateExternalTables typically come from the SQL planner of DataFusion, where we have already used the builder. For other scenarios, we can encourage users in the documentation to use the Builder API to create CreateExternalTables .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't block public construction, the downstream projects won't know to migrate to the builder API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of not migrating is a delayed error message, right? Maybe it is ok if the error doesn't immediately happen on construction for all downstream users

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verification is in the try_new() method used by the builder. If the builder is not used, and this new try_new method isn't used directly either, the verification won't run.
I believe the downside of not migrating may be no error message at all.

name,
schema,
Expand Down Expand Up @@ -365,7 +365,7 @@ impl CreateExternalTable {

/// A struct with same fields as [`CreateExternalTable`] struct so that the DDL can be conveniently
/// destructed with validation that each field is handled, while still requiring that all
/// construction goes through the [`CreateExternalTable::new`] constructor or the builder.
/// construction goes through the [`CreateExternalTable::try_new`] constructor or the builder.
pub struct CreateExternalTableFields {
/// The table name
pub name: TableReference,
Expand Down Expand Up @@ -497,7 +497,7 @@ impl CreateExternalTableBuilder {
}

pub fn build(self) -> Result<CreateExternalTable> {
CreateExternalTable::new(CreateExternalTableFields {
CreateExternalTable::try_new(CreateExternalTableFields {
name: self.name.expect("name is required"),
schema: self.schema.expect("schema is required"),
location: self.location.expect("location is required"),
Expand Down Expand Up @@ -536,7 +536,7 @@ pub struct CreateMemoryTable {
}

impl CreateMemoryTable {
pub fn new(fields: CreateMemoryTableFields) -> Result<Self> {
pub fn try_new(fields: CreateMemoryTableFields) -> Result<Self> {
let CreateMemoryTableFields {
name,
constraints,
Expand Down Expand Up @@ -586,7 +586,7 @@ impl CreateMemoryTable {

/// A struct with same fields as [`CreateMemoryTable`] struct so that the DDL can be conveniently
/// destructed with validation that each field is handled, while still requiring that all
/// construction goes through the [`CreateMemoryTable::new`] constructor or the builder.
/// construction goes through the [`CreateMemoryTable::try_new`] constructor or the builder.
pub struct CreateMemoryTableFields {
/// The table name
pub name: TableReference,
Expand Down Expand Up @@ -664,7 +664,7 @@ impl CreateMemoryTableBuilder {
}

pub fn build(self) -> Result<CreateMemoryTable> {
CreateMemoryTable::new(CreateMemoryTableFields {
CreateMemoryTable::try_new(CreateMemoryTableFields {
name: self.name.expect("name is required"),
constraints: self.constraints,
input: self.input.expect("input is required"),
Expand Down Expand Up @@ -693,7 +693,7 @@ pub struct CreateView {
}

impl CreateView {
pub fn new(fields: CreateViewFields) -> Result<Self> {
pub fn try_new(fields: CreateViewFields) -> Result<Self> {
let CreateViewFields {
name,
input,
Expand Down Expand Up @@ -735,7 +735,7 @@ impl CreateView {

/// A struct with same fields as [`CreateView`] struct so that the DDL can be conveniently
/// destructed with validation that each field is handled, while still requiring that all
/// construction goes through the [`CreateView::new`] constructor or the builder.
/// construction goes through the [`CreateView::try_new`] constructor or the builder.
pub struct CreateViewFields {
/// The table name
pub name: TableReference,
Expand Down Expand Up @@ -795,7 +795,7 @@ impl CreateViewBuilder {
}

pub fn build(self) -> Result<CreateView> {
CreateView::new(CreateViewFields {
CreateView::try_new(CreateViewFields {
name: self.name.expect("name is required"),
input: self.input.expect("input is required"),
or_replace: self.or_replace,
Expand Down
10 changes: 5 additions & 5 deletions datafusion/proto/src/logical_plan/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -568,7 +568,7 @@ impl AsLogicalPlan for LogicalPlanNode {
}

Ok(LogicalPlan::Ddl(DdlStatement::CreateExternalTable(
CreateExternalTable::new(CreateExternalTableFields {
CreateExternalTable::try_new(CreateExternalTableFields {
schema: pb_schema.try_into()?,
name: from_table_reference(
create_extern_table.name.as_ref(),
Expand Down Expand Up @@ -602,8 +602,8 @@ impl AsLogicalPlan for LogicalPlanNode {
None
};

Ok(LogicalPlan::Ddl(DdlStatement::CreateView(CreateView::new(
CreateViewFields {
Ok(LogicalPlan::Ddl(DdlStatement::CreateView(
CreateView::try_new(CreateViewFields {
name: from_table_reference(
create_view.name.as_ref(),
"CreateView",
Expand All @@ -612,8 +612,8 @@ impl AsLogicalPlan for LogicalPlanNode {
input: Arc::new(plan),
or_replace: create_view.or_replace,
definition,
},
)?)))
})?,
)))
}
LogicalPlanType::CreateCatalogSchema(create_catalog_schema) => {
let pb_schema = (create_catalog_schema.schema.clone()).ok_or_else(|| {
Expand Down