Description
Environment
Delta-rs version: 0.16.1
Binding:
Environment:
- Cloud provider: AWS
- OS: Linux
- Other:
Bug
What happened: When creating a checkpoint for a table that has delta.minReaderVersion < 3, a protocol.readerFeatures key (with an empty array value) is written at the beginning of the checkpoint parquet file.
Example:
{"metaData": null, "protocol": {"minReaderVersion": 1, "minWriterVersion": 2, "writerFeatures": [], "readerFeatures": []}, "txn": null, "add": null, "remove": null}
{"metaData": {"id": "dfba8203-b89f-4501-9f77-7d0d6d597e50", "name": null, "description": null, "schemaString": "{\"type\":\"struct\",\"fields\":[...
...
This causes certain query engines (in this case, Trino) to fail with the error "readerFeatures must not exist when minReaderVersion is less than 3".
Example:
select count(*) from test_fua_delta_lake_uncompacted_4;
Query 20240329_000911_00000_q83vs failed: readerFeatures must not exist when minReaderVersion is less than 3
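The check that rejects this checkpoint can be approximated as follows. This is a minimal Python sketch, not Trino's actual implementation: the function name validate_protocol is hypothetical, and the sketch assumes that an explicit empty list counts as "present", which is what the error above suggests.

```python
import json


def validate_protocol(protocol: dict) -> None:
    """Hypothetical reader-side check; not Trino's actual code."""
    # Assumption: a key that is present counts, even if its value is an empty list.
    if protocol["minReaderVersion"] < 3 and protocol.get("readerFeatures") is not None:
        raise ValueError(
            "readerFeatures must not exist when minReaderVersion is less than 3"
        )


# The protocol row from the checkpoint example above.
row = json.loads(
    '{"metaData": null, "protocol": {"minReaderVersion": 1, "minWriterVersion": 2, '
    '"writerFeatures": [], "readerFeatures": []}, "txn": null, "add": null, "remove": null}'
)

try:
    validate_protocol(row["protocol"])
    outcome = "accepted"
except ValueError as exc:
    outcome = f"rejected: {exc}"

print(outcome)
```

A protocol entry without the readerFeatures key, for example {"minReaderVersion": 1, "minWriterVersion": 2}, passes the same check.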
Since readerFeatures requires minReaderVersion 3 and writerFeatures requires minWriterVersion 7, these two keys (protocol.readerFeatures and protocol.writerFeatures) should not be added when delta.minReaderVersion < 3 and delta.minWriterVersion < 7, respectively.
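The rule above can be sketched in Python as a small filter over the protocol action. This is only an illustration of the expected behavior, not the delta-rs checkpoint writer; the helper name strip_unsupported_features is hypothetical, and the default of 1 for a missing version field is an assumption made for the sketch.

```python
def strip_unsupported_features(protocol: dict) -> dict:
    """Hypothetical helper: drop feature lists the protocol versions do not allow."""
    cleaned = dict(protocol)  # copy so the caller's dict is left untouched
    # readerFeatures is only valid from minReaderVersion 3 onward.
    if cleaned.get("minReaderVersion", 1) < 3:
        cleaned.pop("readerFeatures", None)
    # writerFeatures is only valid from minWriterVersion 7 onward.
    if cleaned.get("minWriterVersion", 1) < 7:
        cleaned.pop("writerFeatures", None)
    return cleaned


# Protocol values taken from the checkpoint example in this report.
legacy = {
    "minReaderVersion": 1,
    "minWriterVersion": 2,
    "writerFeatures": [],
    "readerFeatures": [],
}
print(strip_unsupported_features(legacy))
```

With the values from the example, both keys are dropped and only the two version fields remain. A protocol at reader version 3 and writer version 7 is returned unchanged.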
What you expected to happen: protocol.readerFeatures and protocol.writerFeatures should not be added to the top of the checkpoint parquet files when delta.minReaderVersion < 3 and delta.minWriterVersion < 7, respectively.
How to reproduce it: Create a table in Trino, then commit to that table and create a checkpoint from delta-rs. Then run a SELECT COUNT(*) query against the table in Trino.
More details: