-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Open
Labels
bugSomething isn't workingSomething isn't working
Description
Describe the bug
datafusion-cli is compiled from the latest main commit 5bbdb7e
To reproduce, tpch dataset can be generated via https://github.com/clflushopt/tpchgen-rs/tree/main/tpchgen-cli, and update the following paths
CREATE EXTERNAL TABLE customer
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/customer.parquet';
CREATE EXTERNAL TABLE nation
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/nation.parquet';
CREATE EXTERNAL TABLE part
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/part.parquet';
CREATE EXTERNAL TABLE region
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/region.parquet';
CREATE EXTERNAL TABLE lineitem
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/lineitem.parquet';
CREATE EXTERNAL TABLE orders
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/orders.parquet';
CREATE EXTERNAL TABLE partsupp
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/partsupp.parquet';
CREATE EXTERNAL TABLE supplier
STORED AS PARQUET
LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/supplier.parquet';
WITH MonthlySales AS (
SELECT
DATE_TRUNC('month', o_orderdate) AS sales_month,
SUM(l_extendedprice) AS total_sales
FROM orders AS o
JOIN lineitem AS l
ON o.o_orderkey = l.l_orderkey
GROUP BY sales_month
),
TopRegions AS (
SELECT
n.n_name AS region_name,
SUM(ps.ps_supplycost) AS region_supply_cost
FROM partsupp AS ps
JOIN supplier AS s
ON ps.ps_suppkey = s.s_suppkey
JOIN nation AS n
ON s.s_nationkey = n.n_nationkey
GROUP BY region_name
ORDER BY region_supply_cost DESC
LIMIT 5
)
SELECT
ms.sales_month,
ms.total_sales,
tr.region_name,
tr.region_supply_cost
FROM MonthlySales AS ms
LEFT JOIN TopRegions AS tr
ON tr.region_supply_cost > (
SELECT AVG(region_supply_cost) FROM TopRegions
);
Optimizer rule 'eliminate_cross_join' failed
caused by
Check optimizer-specific invariants after optimizer rule: eliminate_cross_join
caused by
Internal error: Failed due to a difference in schemas: original schema: DFSchema { inner: Schema { fields: [Field { name: "sales_month", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "total_sales", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_name", data_type: Utf8View, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_supply_cost", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [Some(Bare { table: "ms" }), Some(Bare { table: "ms" }), Some(Bare { table: "tr" }), Some(Bare { table: "tr" })], functional_dependencies: FunctionalDependencies { deps: [FunctionalDependence { source_indices: [0], target_indices: [0, 1], nullable: true, mode: Multi }, FunctionalDependence { source_indices: [2], target_indices: [2, 3], nullable: true, mode: Multi }] } }, new schema: DFSchema { inner: Schema { fields: [Field { name: "sales_month", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "total_sales", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_name", data_type: Utf8View, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_supply_cost", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "avg(topregions.region_supply_cost)", data_type: Decimal128(29, 6), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [Some(Bare { table: "ms" }), Some(Bare { table: "ms" }), Some(Bare { table: "tr" }), Some(Bare { table: "tr" }), Some(Bare { table: "tr" })], functional_dependencies: FunctionalDependencies { deps: [FunctionalDependence { source_indices: [0], target_indices: [0, 1], nullable: true, mode: Multi }, FunctionalDependence { source_indices: [2], target_indices: [2, 3], nullable: true, mode: Multi }] } }.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues
To Reproduce
No response
Expected behavior
No response
Additional context
Found by SQLStorm #17698
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working