datafusion-proto deserialize with q16 sql fails #3820

Closed
r4ntix opened this issue Oct 13, 2022 · 2 comments · Fixed by #4050
Labels: bug

r4ntix commented Oct 13, 2022

Describe the bug
Serializing and then deserializing the logical plan for q16 from the benchmarks fails:

Error: SchemaError(FieldNotFound { qualifier: Some("part"), name: "p_brand", valid_fields: Some(["p_brand", "p_type", "p_size", "COUNT(DISTINCT partsupp.ps_suppkey)"]) })

This problem was found in arrow-ballista's benchmark tests: apache/datafusion-ballista#330

To Reproduce

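// `ctx` is assumed to be a SessionContext with the benchmark tables
// (part, partsupp, supplier) already registered.
use datafusion_proto::bytes::{logical_plan_from_bytes, logical_plan_to_bytes};

let plan = ctx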
        .sql(
            "
select
    p_brand,
    p_type,
    p_size,
    count(distinct ps_suppkey) as supplier_cnt
from
    partsupp,
    part
where
  p_partkey = ps_partkey
  and p_brand <> 'Brand#45'
  and p_type not like 'MEDIUM POLISHED%'
  and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
  and ps_suppkey not in (
    select
        s_suppkey
    from
        supplier
    where
            s_comment like '%Customer%Complaints%'
)
group by
    p_brand,
    p_type,
    p_size
order by
    supplier_cnt desc,
    p_brand,
    p_type,
    p_size;
    ",
        )
        .await?
        .to_logical_plan()?;
    let bytes = logical_plan_to_bytes(&plan)?;
    let logical_round_trip = logical_plan_from_bytes(&bytes, &ctx)?;
    assert_eq!(format!("{:?}", plan), format!("{:?}", logical_round_trip));

The call logical_plan_from_bytes(&bytes, &ctx)? fails with the error shown above.

Expected behavior

Additional context

@andygrove

The issue is in this part of the plan:

    Projection: group_alias_0 AS p_brand, group_alias_1 AS p_type, group_alias_2 AS p_size, COUNT(alias1) AS COUNT(DISTINCT partsupp.ps_suppkey)
      Aggregate: groupBy=[[group_alias_0, group_alias_1, group_alias_2]], aggr=[[COUNT(alias1)]]
        Aggregate: groupBy=[[part.p_brand AS group_alias_0, part.p_type AS group_alias_1, part.p_size AS group_alias_2, partsupp.ps_suppkey AS alias1]], aggr=[[]]

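The inner Aggregate has part.p_brand AS group_alias_0, and the Projection later has group_alias_0 AS p_brand. The qualifier is dropped along the way: the output field is the unqualified p_brand, so a lookup of part.p_brand no longer matches, which is consistent with the FieldNotFound error above.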
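To illustrate how losing the qualifier can produce a FieldNotFound-style failure, here is a minimal, self-contained sketch. It is a toy model, not DataFusion's actual schema code: the Field struct and the index_of and alias_all helpers are hypothetical names, and the matching rule (a qualified reference only matches a field carrying the same qualifier) is an assumption based on the error message in this issue.

```rust
// Toy model of qualified field lookup. This is NOT DataFusion's DFSchema;
// all names here are hypothetical and exist only for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

impl Field {
    fn new(qualifier: Option<&str>, name: &str) -> Self {
        Field {
            qualifier: qualifier.map(str::to_string),
            name: name.to_string(),
        }
    }
}

/// Returns the index of the first field matching `name` and, when given,
/// `qualifier`.
fn index_of(schema: &[Field], qualifier: Option<&str>, name: &str) -> Option<usize> {
    schema.iter().position(|f| {
        f.name == name
            && match qualifier {
                // Assumed rule: a qualified reference only matches a field
                // that carries the same qualifier.
                Some(q) => f.qualifier.as_deref() == Some(q),
                // An unqualified reference matches on the name alone.
                None => true,
            }
    })
}

/// Output schema of a projection whose expressions are all aliased:
/// in this model, an alias produces a field without a qualifier.
fn alias_all(aliases: &[&str]) -> Vec<Field> {
    aliases.iter().map(|a| Field::new(None, a)).collect()
}

fn main() {
    // Table scan: the field is still qualified as part.p_brand.
    let scan = vec![Field::new(Some("part"), "p_brand")];
    // part.p_brand AS group_alias_0, then group_alias_0 AS p_brand.
    let inner = alias_all(&["group_alias_0"]);
    let outer = alias_all(&["p_brand"]);

    println!("scan,  part.p_brand: {:?}", index_of(&scan, Some("part"), "p_brand"));
    println!("inner, group_alias_0: {:?}", index_of(&inner, None, "group_alias_0"));
    println!("outer, part.p_brand: {:?}", index_of(&outer, Some("part"), "p_brand"));
    println!("outer, p_brand:      {:?}", index_of(&outer, None, "p_brand"));
}
```

In this model the qualified lookup succeeds against the scan schema but finds nothing in the re-aliased output, while the unqualified lookup still succeeds. That mirrors the shape of the reported error, where part.p_brand is requested and only an unqualified p_brand is valid.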

@andygrove

The root cause is #4049.
