SELECTing 0 column with EXCLUDE gives projection_push_down error #6510

@mustafasrepo

Describe the bug

Assume we have a table named table1 with a single column a, as shown below:

a
1
2
3
4

When I run the query below on this table:

SELECT * EXCEPT(a)
FROM table1

it fails with the following error:

Error: Context("push_down_projection", Internal("Optimizer rule 'push_down_projection' failed, due to generate a different schema, original schema: DFSchema { fields: [], metadata: {} }, new schema: DFSchema { fields: [DFField { qualifier: Some(Bare { table: \"table1\" }), field: Field { name: \"a\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }], metadata: {} }"))
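
To make the error message easier to read, here is a minimal, self-contained sketch of the two steps it implies: expanding `* EXCEPT(a)` into an empty column list, and then checking that an optimizer rewrite kept that schema. This is a toy model, not DataFusion's code; `Schema`, `expand_wildcard_except`, and `check_schema_preserved` are hypothetical names introduced only for illustration.

```rust
// Toy model (an assumption, not DataFusion's types): a schema is an ordered list of column names.
type Schema = Vec<String>;

/// Expand `SELECT * EXCEPT(...)` against a table's columns (hypothetical helper).
fn expand_wildcard_except(table: &[&str], excluded: &[&str]) -> Schema {
    table
        .iter()
        .filter(|col| !excluded.contains(*col))
        .map(|col| col.to_string())
        .collect()
}

/// Invariant an optimizer rule is expected to keep: the rewritten plan
/// exposes the same schema as the plan it replaced.
fn check_schema_preserved(
    rule: &str,
    original: &Schema,
    rewritten: &Schema,
) -> Result<(), String> {
    if original == rewritten {
        Ok(())
    } else {
        Err(format!(
            "Optimizer rule '{rule}' failed: original schema {original:?}, new schema {rewritten:?}"
        ))
    }
}
```

In this model, `expand_wildcard_except(&["a"], &["a"])` yields an empty schema, which corresponds to `fields: []` in the original schema above. A rewrite that brings column `a` back then fails the check, which mirrors the shape of the reported error.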

However, assume we have a table named empty_table that has no columns. The query below works:

SELECT *
FROM empty_table

So I don't think the bug is caused by returning a result with zero columns, but I am not sure.
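
One possible explanation that fits both observations, offered only as an unconfirmed assumption, is that the projection push-down keeps at least one column when the query requires none (so the scan still produces rows), and skips that fallback only when the table itself has no columns. The sketch below models that behavior with a hypothetical `columns_to_scan` function; it is not taken from DataFusion's source.

```rust
/// Hypothetical scan-pruning step: given the table's columns and the columns
/// the query requires, return the columns the scan will read.
///
/// ASSUMPTION (not confirmed by this report): when no column is required,
/// the first column is kept so the scan still yields rows, unless the table
/// has no columns at all.
fn columns_to_scan(table: &[&str], required: &[&str]) -> Vec<String> {
    let mut kept: Vec<String> = table
        .iter()
        .filter(|col| required.contains(*col))
        .map(|col| col.to_string())
        .collect();
    if kept.is_empty() {
        // Fallback: keep one column if the table has any.
        if let Some(first) = table.first() {
            kept.push(first.to_string());
        }
    }
    kept
}
```

Under this assumption, `columns_to_scan(&["a"], &[])` returns `["a"]`, which no longer matches the empty schema the query asked for, while `columns_to_scan(&[], &[])` stays empty and so the empty_table query passes.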

To Reproduce

You can use the test below to reproduce the problem:

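// Imports assumed for running this as a test against the `datafusion` crate.
use datafusion::arrow::util::pretty::print_batches;
use datafusion::error::Result;
use datafusion::physical_plan::collect;
use datafusion::prelude::*;

#[tokio::test]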
async fn test_exclude_error() -> Result<()> {
    let config = SessionConfig::new()
        .with_target_partitions(1);
    let ctx = SessionContext::with_config(config);
    ctx.sql("CREATE TABLE table1 (
          a INT,
        ) as VALUES
              (1),
              (2),
              (3),
              (4)").await?;

    let sql = "SELECT * EXCEPT(a)
                    FROM table1";

    let msg = format!("Creating logical plan for '{sql}'");
    let dataframe = ctx.sql(sql).await.expect(&msg);
    let physical_plan = dataframe.create_physical_plan().await?;
    let batches = collect(physical_plan, ctx.task_ctx()).await?;
    print_batches(&batches)?;
    Ok(())
}

The test below shows that DataFusion has no problem returning zero columns:

#[tokio::test]
async fn test_empty_table() -> Result<()> {
    let config = SessionConfig::new()
        .with_target_partitions(1);
    let ctx = SessionContext::with_config(config);
    ctx.sql("CREATE TABLE empty_table ()").await?;

    let sql = "SELECT *
                    FROM empty_table";

    let msg = format!("Creating logical plan for '{sql}'");
    let dataframe = ctx.sql(sql).await.expect(&msg);
    let physical_plan = dataframe.create_physical_plan().await?;
    let batches = collect(physical_plan, ctx.task_ctx()).await?;
    print_batches(&batches)?;
    Ok(())
}

Expected behavior

I expect the first query to run successfully.

Additional context

No response
