Attach diagnostic information for duplicate table name error #20720

Open
buraksenn wants to merge 7 commits into apache:main from buraksenn:attach-diagnostic-for-duplicate-cte

@buraksenn
Contributor

Which issue does this PR close?

Rationale for this change

This is part of the #14429 epic to attach diagnostics to more errors.

What changes are included in this PR?

This PR stores an optional span alongside each CTE and uses those spans to attach diagnostics on a best-effort basis.

Are these changes tested?

  • I ran the existing SQL tests to make sure no functionality breaks
  • Added new tests for the diagnostics

Are there any user-facing changes?

Yes. Users can call diagnostic methods to get detailed information about errors.

@github-actions (bot) added the sql (SQL Planner) label on Mar 5, 2026
Contributor

@kosiew kosiew left a comment

@buraksenn

Thanks for the work here.
I think there are a couple of gaps around alias handling that could lead to incorrect behavior or missed diagnostics. Left some detailed comments below.

}
};

let mut alias_spans: HashMap<String, Option<Span>> = HashMap::new();

I think this currently extracts only the final identifier for unaliased table factors. That means something like catalog1.schema.person and catalog2.schema.person would both be tracked as person and get flagged as duplicates.

However, DataFusion's scan qualifier uses the full normalized TableReference, so these should actually be treated as distinct.

Would it make sense to reuse object_name_to_table_reference(name.clone())?.to_string() here, or otherwise preserve the full relation identity for unaliased tables?
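
To make the concern concrete, here is a minimal sketch that uses plain string parts instead of DataFusion's types. The `NameParts` alias and both function names are hypothetical, and lowercasing stands in for identifier normalization as a simplifying assumption. It only contrasts keying on the last identifier with keying on the full dotted name.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a SQL object name such as `catalog1.schema.person`.
/// This is not DataFusion's or sqlparser's type, just the identifier parts.
type NameParts = Vec<&'static str>;

/// Keys each relation by its last identifier only (the behavior described above).
fn has_duplicate_by_last_ident(names: &[NameParts]) -> bool {
    let mut seen = HashSet::new();
    names
        .iter()
        .filter_map(|parts| parts.last())
        // `insert` returns false when the key was already present.
        .any(|last| !seen.insert(last.to_ascii_lowercase()))
}

/// Keys each relation by its full dotted name, so different catalogs stay distinct.
/// Lowercasing is an assumed simplification of identifier normalization.
fn has_duplicate_by_full_name(names: &[NameParts]) -> bool {
    let mut seen = HashSet::new();
    names
        .iter()
        .any(|parts| !seen.insert(parts.join(".").to_ascii_lowercase()))
}
```

With `catalog1.schema.person` and `catalog2.schema.person` as input, the first function reports a duplicate and the second does not, which is the distinction the full `TableReference` would preserve.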

@@ -690,21 +694,75 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
self.plan_table_with_joins(input, planner_context)

It looks like the duplicate alias diagnostic is only applied in the multi-entry comma FROM path. A single TableWithJoins still goes straight to plan_table_with_joins, so explicit joins skip this check.

For example, (SELECT 1 AS a) AS t JOIN (SELECT 2 AS b) AS t ON true would not be caught here. Depending on the columns, this could either fall back to a schema error or even plan successfully with duplicate t aliases.

Could we move the alias and span tracking into the relation or join planning path so both comma joins and explicit JOINs go through the same validation and produce consistent diagnostics?
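
As a rough illustration of what a shared validation could look like, the sketch below walks both the leading relation and every joined relation of each FROM entry. `Relation`, `FromEntry`, and `first_duplicate_alias` are hypothetical, simplified stand-ins and not the planner's actual types or API.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified relation: only the name it is visible under.
struct Relation {
    alias: String,
}

/// Hypothetical stand-in for one FROM entry: a leading relation plus
/// the relations attached to it through explicit JOINs.
struct FromEntry {
    relation: Relation,
    joins: Vec<Relation>,
}

/// Returns the first alias seen twice across all FROM entries.
/// Comma-separated entries and explicit JOINs go through the same loop,
/// so a single entry with a duplicate inside its joins is caught as well.
fn first_duplicate_alias(from: &[FromEntry]) -> Option<String> {
    let mut seen = HashSet::new();
    for entry in from {
        let relations = std::iter::once(&entry.relation).chain(entry.joins.iter());
        for rel in relations {
            if !seen.insert(rel.alias.as_str()) {
                return Some(rel.alias.clone());
            }
        }
    }
    None
}
```

For the `t JOIN t` example above, a single `FromEntry` whose leading relation and one joined relation both carry the alias `t` would yield `Some("t")`.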

|t: &TableWithJoins| -> Option<(String, Option<Span>)> {
match &t.relation {
TableFactor::Table { alias: Some(a), .. }
| TableFactor::Derived { alias: Some(a), .. }

This extract_table_name closure is doing a few different things at once, like relation name normalization, alias span lookup, and relation kind filtering.

Once the JOIN coverage is addressed, would it be worth pulling this into a small helper near relation planning? It might make the naming rules easier to follow and extend as more table factor variants are added.
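
Something along these lines is what I have in mind, sketched with a simplified enum rather than sqlparser's `TableFactor`. The `Factor` and `Span` types and the name `relation_name_and_span` are hypothetical. The point is only that the naming rules sit in one `match`, so a new variant needs a single new arm.

```rust
/// Hypothetical byte-offset span; not sqlparser's `Span`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical, simplified table factor with only the fields the helper needs.
enum Factor {
    Table { name: Vec<String>, alias: Option<(String, Span)> },
    Derived { alias: Option<(String, Span)> },
    Other,
}

/// Returns the name a relation is visible under, plus a span when one is known.
/// An alias wins when present; an unaliased table keeps its full dotted name.
fn relation_name_and_span(factor: &Factor) -> Option<(String, Option<Span>)> {
    match factor {
        Factor::Table { alias: Some((name, span)), .. }
        | Factor::Derived { alias: Some((name, span)) } => {
            Some((name.clone(), Some(*span)))
        }
        Factor::Table { name, alias: None } => Some((name.join("."), None)),
        // Unaliased subqueries and unsupported factors are skipped.
        Factor::Derived { alias: None } | Factor::Other => None,
    }
}
```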

@buraksenn
Contributor Author

Thanks for the detailed review, @kosiew. I'll start working on these today.

Labels

sql SQL Planner

Development

Successfully merging this pull request may close these issues.

Attach Diagnostic to "duplicate table name" error

3 participants