Closed
Description
Environment
Delta-rs version: deltalake==0.12.0
Binding:
Environment:
- Cloud provider:
- OS:
- Other: python
Bug
What happened:
If the Delta table schema contains a column name with uppercase characters, the merge fails with a generic DeltaTable error stating that the field was not found.
What you expected to happen:
The DeltaTable merge operation should complete successfully.
How to reproduce it:
Sample code:
from deltalake import DeltaTable, write_deltalake
from datetime import datetime
import polars as pl
df = pl.DataFrame(
    {
        "Sales_order_id": ["1000", "1001", "1002", "1003"],
        "product": ["bike", "scooter", "car", "motorcycle"],
        "order_date": [
            datetime(2023, 1, 1),
            datetime(2023, 1, 5),
            datetime(2023, 1, 10),
            datetime(2023, 2, 1),
        ],
        "sales_price": [120.25, 2400, 32000, 9000],
        "paid_by_customer": [True, False, False, True],
    }
)
print(df)
df.write_delta("sales_orders_old", mode="append")
new_data = pl.DataFrame(
    {
        "Sales_order_id": ["1002", "1004"],
        "product": ["car", "car"],
        "order_date": [datetime(2023, 1, 10), datetime(2023, 2, 5)],
        "sales_price": [30000.0, 40000.0],
        "paid_by_customer": [True, True],
    }
)
from polars.io.delta import _convert_pa_schema_to_delta
dt = DeltaTable("sales_orders_old")
source = new_data.to_arrow()
delta_schema = _convert_pa_schema_to_delta(source.schema)
source = source.cast(delta_schema)
(
    dt.merge(
        source=source,
        predicate="s.Sales_order_id = t.Sales_order_id",
        source_alias="s",
        target_alias="t",
    )
    .when_matched_update_all()
    .when_not_matched_insert_all()
    .execute()
)
This throws: DeltaError: Generic DeltaTable error: Schema error: No field named s.sales_order_id.
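The error names the field in lowercase (s.sales_order_id), although the predicate was written as s.Sales_order_id. That suggests the predicate string is parsed as SQL, where unquoted identifiers are commonly folded to lowercase. One possible workaround, not verified against deltalake 0.12.0, is to double-quote the column names in the predicate. The helpers below (quote_identifier and build_merge_predicate) are hypothetical names introduced for illustration and are not part of deltalake or polars. This is a minimal sketch that only builds the predicate string:

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_merge_predicate(key_columns, source_alias="s", target_alias="t"):
    """Build an equality predicate over the key columns with quoted names.

    Hypothetical helper: it assumes the merge predicate parser honours
    double-quoted identifiers and preserves their case.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")
    clauses = [
        f"{source_alias}.{quote_identifier(col)} = {target_alias}.{quote_identifier(col)}"
        for col in key_columns
    ]
    return " AND ".join(clauses)


predicate = build_merge_predicate(["Sales_order_id"])
print(predicate)  # s."Sales_order_id" = t."Sales_order_id"
```

The resulting string could then be passed as the predicate argument of dt.merge(...) in place of the unquoted one. Whether this avoids the schema error depends on how the installed version resolves quoted identifiers.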
More details:
The sample code is adapted from this blog post: https://delta.io/blog/2023-10-22-delta-rs-python-v0.12.0/