Description
Apache Iceberg version
1.2.1 (latest release)
Query engine
Spark
Please describe the bug 🐞
Make sure to set the number of partitions to one, to avoid having files with a single record (merge on read will just discard the file when all the rows are marked as deleted).
SET spark.sql.shuffle.partitions=1;
Create a table:
create table default.t1
(
foo string,
bar int,
dt timestamp
) TBLPROPERTIES (
'format-version'='2',
'write.delete.mode'='merge-on-read',
'write.update.mode'='merge-on-read',
'write.merge.mode'='merge-on-read'
)
PARTITIONED BY (days(dt));
Insert some data:
INSERT INTO default.t1 VALUES
('a', 1, now()),
('b', 2, now()),
('c', 3, now());
SELECT returns:
foo | bar | dt |
---|---|---|
a | 1 | 2023-04-28 13:51:28.522796 |
b | 2 | 2023-04-28 13:51:28.522817 |
c | 3 | 2023-04-28 13:51:28.522837 |
Delete one row:
DELETE FROM default.t1 WHERE foo = 'b'
Look at the files:
SELECT * FROM default.t1.files
We see one data file, and one delete file:
content | file_path | file_format | spec_id | partition | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets | equality_ids | sort_order_id | readable_metrics |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | s3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet | PARQUET | 0 | Row(dt_day=datetime.date(2023, 4, 28)) | 3 | 897 | {1: 54, 2: 52, 3: 56} | {1: 3, 2: 3, 3: 3} | {1: 0, 2: 0, 3: 0} | {} | {1: bytearray(b'a'), 2: bytearray(b'\x01\x00\x00\x00'), 3: bytearray(b',\x92\x12\xc5e\xfa\x05\x00')} | {1: bytearray(b'c'), 2: bytearray(b'\x03\x00\x00\x00'), 3: bytearray(b'U\x92\x12\xc5e\xfa\x05\x00')} | None | [4] | None | 0 | Row(bar=Row(column_size=52, value_count=3, null_value_count=0, nan_value_count=None, lower_bound=1, upper_bound=3), dt=Row(column_size=56, value_count=3, null_value_count=0, nan_value_count=None, lower_bound=datetime.datetime(2023, 4, 28, 13, 51, 28, 522796), upper_bound=datetime.datetime(2023, 4, 28, 13, 51, 28, 522837)), foo=Row(column_size=54, value_count=3, null_value_count=0, nan_value_count=None, lower_bound='a', upper_bound='c')) |
1 | s3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2428-3743fe19-d841-4d1c-b940-bd73692b29c7-00001-deletes.parquet | PARQUET | 0 | Row(dt_day=datetime.date(2023, 4, 28)) | 1 | 1490 | {2147483546: 142, 2147483545: 46} | {2147483546: 1, 2147483545: 1} | {2147483546: 0, 2147483545: 0} | {} | {2147483546: bytearray(b's3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet'), 2147483545: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00')} | {2147483546: bytearray(b's3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet'), 2147483545: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00')} | None | None | None | None | Row(bar=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None), dt=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None), foo=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None)) |
I don't like delete files, they slow down my queries, so let's clean them up:
CALL system.rewrite_data_files(table => 'default.t1', options => map (
'delete-file-threshold', '1'
))
Let's look again at the files:
SELECT * FROM default.t1.files
This is incorrect, we still get the delete file, but the data file has been updated (we can see in the value count that there are two rows):
content | file_path | file_format | spec_id | partition | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets | equality_ids | sort_order_id | readable_metrics |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | s3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2433-3108a2a5-b3f3-4c16-849b-4c54604584f5-00001.parquet | PARQUET | 0 | Row(dt_day=datetime.date(2023, 4, 28)) | 2 | 909 | {1: 55, 2: 53, 3: 60} | {1: 2, 2: 2, 3: 2} | {1: 0, 2: 0, 3: 0} | {} | {1: bytearray(b'a'), 2: bytearray(b'\x01\x00\x00\x00'), 3: bytearray(b',\x92\x12\xc5e\xfa\x05\x00')} | {1: bytearray(b'c'), 2: bytearray(b'\x03\x00\x00\x00'), 3: bytearray(b'U\x92\x12\xc5e\xfa\x05\x00')} | None | [4] | None | 0 | Row(bar=Row(column_size=53, value_count=2, null_value_count=0, nan_value_count=None, lower_bound=1, upper_bound=3), dt=Row(column_size=60, value_count=2, null_value_count=0, nan_value_count=None, lower_bound=datetime.datetime(2023, 4, 28, 13, 51, 28, 522796), upper_bound=datetime.datetime(2023, 4, 28, 13, 51, 28, 522837)), foo=Row(column_size=55, value_count=2, null_value_count=0, nan_value_count=None, lower_bound='a', upper_bound='c')) |
1 | s3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2428-3743fe19-d841-4d1c-b940-bd73692b29c7-00001-deletes.parquet | PARQUET | 0 | Row(dt_day=datetime.date(2023, 4, 28)) | 1 | 1490 | {2147483546: 142, 2147483545: 46} | {2147483546: 1, 2147483545: 1} | {2147483546: 0, 2147483545: 0} | {} | {2147483546: bytearray(b's3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet'), 2147483545: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00')} | {2147483546: bytearray(b's3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet'), 2147483545: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00')} | None | None | None | None | Row(bar=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None), dt=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None), foo=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None)) |
Double checking, it still looks good:
SELECT * FROM default.t1
foo | bar | dt |
---|---|---|
a | 1 | 2023-04-28 13:51:28.522796 |
c | 3 | 2023-04-28 13:51:28.522837 |
Also in the snapshots:
SELECT * FROM default.t1.snapshots
The summary still mentions 3 rows:
committed_at | snapshot_id | parent_id | operation | manifest_list | summary |
---|---|---|---|---|---|
2023-04-28 13:51:28.654000 | 1830440247885882398 | None | append | s3://warehouse/default/t1/metadata/snap-1830440247885882398-1-e17ec3b6-01a0-49e8-8be4-fffca9dbdaef.avro | {'spark.app.id': 'local-1682689536619', 'changed-partition-count': '1', 'added-data-files': '1', 'total-equality-deletes': '0', 'added-records': '3', 'total-position-deletes': '0', 'added-files-size': '897', 'total-delete-files': '0', 'total-files-size': '897', 'total-records': '3', 'total-data-files': '1'} |
2023-04-28 13:51:34.738000 | 1705447835275510977 | 1830440247885882398 | overwrite | s3://warehouse/default/t1/metadata/snap-1705447835275510977-1-44a6bbb7-a1fc-4e6b-bed7-5f8a578a2a11.avro | {'spark.app.id': 'local-1682689536619', 'changed-partition-count': '1', 'added-position-deletes': '1', 'total-equality-deletes': '0', 'total-position-deletes': '1', 'added-position-delete-files': '1', 'added-files-size': '1490', 'total-delete-files': '1', 'added-delete-files': '1', 'total-files-size': '2387', 'total-records': '3', 'total-data-files': '1'} |
2023-04-28 15:06:04.304000 | 1992524767636422274 | 1705447835275510977 | replace | s3://warehouse/default/t1/metadata/snap-1992524767636422274-1-8c696509-2fee-4dd8-ae4c-1729da3bfe17.avro | {'added-data-files': '1', 'total-equality-deletes': '0', 'added-records': '2', 'deleted-data-files': '1', 'deleted-records': '3', 'total-records': '2', 'removed-files-size': '897', 'changed-partition-count': '1', 'total-position-deletes': '1', 'added-files-size': '909', 'total-delete-files': '1', 'total-files-size': '2399', 'total-data-files': '1'} |
I thought, maybe we need to rewrite the manifests:
CALL system.rewrite_manifests('default.t1')
rewritten_manifests_count | added_manifests_count |
---|---|
2 | 1 |
SELECT * FROM default.t1.snapshots
Adds a new snapshot, but still 3 rows:
committed_at | snapshot_id | parent_id | operation | manifest_list | summary |
---|---|---|---|---|---|
2023-04-28 13:51:28.654000 | 1830440247885882398 | None | append | s3://warehouse/default/t1/metadata/snap-1830440247885882398-1-e17ec3b6-01a0-49e8-8be4-fffca9dbdaef.avro | {'spark.app.id': 'local-1682689536619', 'changed-partition-count': '1', 'added-data-files': '1', 'total-equality-deletes': '0', 'added-records': '3', 'total-position-deletes': '0', 'added-files-size': '897', 'total-delete-files': '0', 'total-files-size': '897', 'total-records': '3', 'total-data-files': '1'} |
2023-04-28 13:51:34.738000 | 1705447835275510977 | 1830440247885882398 | overwrite | s3://warehouse/default/t1/metadata/snap-1705447835275510977-1-44a6bbb7-a1fc-4e6b-bed7-5f8a578a2a11.avro | {'spark.app.id': 'local-1682689536619', 'changed-partition-count': '1', 'added-position-deletes': '1', 'total-equality-deletes': '0', 'total-position-deletes': '1', 'added-position-delete-files': '1', 'added-files-size': '1490', 'total-delete-files': '1', 'added-delete-files': '1', 'total-files-size': '2387', 'total-records': '3', 'total-data-files': '1'} |
2023-04-28 15:06:04.304000 | 1992524767636422274 | 1705447835275510977 | replace | s3://warehouse/default/t1/metadata/snap-1992524767636422274-1-8c696509-2fee-4dd8-ae4c-1729da3bfe17.avro | {'added-data-files': '1', 'total-equality-deletes': '0', 'added-records': '2', 'deleted-data-files': '1', 'deleted-records': '3', 'total-records': '2', 'removed-files-size': '897', 'changed-partition-count': '1', 'total-position-deletes': '1', 'added-files-size': '909', 'total-delete-files': '1', 'total-files-size': '2399', 'total-data-files': '1'} |
2023-04-28 18:01:58.283000 | 5057249889609572796 | 1992524767636422274 | replace | s3://warehouse/default/t1/metadata/snap-5057249889609572796-1-31f5ecc8-a81f-40c6-8a9d-205a52f1c64c.avro | {'changed-partition-count': '0', 'total-equality-deletes': '0', 'manifests-created': '1', 'total-position-deletes': '1', 'total-delete-files': '1', 'manifests-kept': '0', 'manifests-replaced': '2', 'entries-processed': '0', 'total-files-size': '2399', 'total-records': '2', 'total-data-files': '1'} |
Also, the files are still incorrect:
SELECT * FROM default.t1.files
content | file_path | file_format | spec_id | partition | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets | equality_ids | sort_order_id | readable_metrics |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | s3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2433-3108a2a5-b3f3-4c16-849b-4c54604584f5-00001.parquet | PARQUET | 0 | Row(dt_day=datetime.date(2023, 4, 28)) | 2 | 909 | {1: 55, 2: 53, 3: 60} | {1: 2, 2: 2, 3: 2} | {1: 0, 2: 0, 3: 0} | {} | {1: bytearray(b'a'), 2: bytearray(b'\x01\x00\x00\x00'), 3: bytearray(b',\x92\x12\xc5e\xfa\x05\x00')} | {1: bytearray(b'c'), 2: bytearray(b'\x03\x00\x00\x00'), 3: bytearray(b'U\x92\x12\xc5e\xfa\x05\x00')} | None | [4] | None | 0 | Row(bar=Row(column_size=53, value_count=2, null_value_count=0, nan_value_count=None, lower_bound=1, upper_bound=3), dt=Row(column_size=60, value_count=2, null_value_count=0, nan_value_count=None, lower_bound=datetime.datetime(2023, 4, 28, 13, 51, 28, 522796), upper_bound=datetime.datetime(2023, 4, 28, 13, 51, 28, 522837)), foo=Row(column_size=55, value_count=2, null_value_count=0, nan_value_count=None, lower_bound='a', upper_bound='c')) |
1 | s3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2428-3743fe19-d841-4d1c-b940-bd73692b29c7-00001-deletes.parquet | PARQUET | 0 | Row(dt_day=datetime.date(2023, 4, 28)) | 1 | 1490 | {2147483546: 142, 2147483545: 46} | {2147483546: 1, 2147483545: 1} | {2147483546: 0, 2147483545: 0} | {} | {2147483546: bytearray(b's3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet'), 2147483545: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00')} | {2147483546: bytearray(b's3://warehouse/default/t1/data/dt_day=2023-04-28/00000-2424-8d75df2c-4639-49a4-82e6-1772365a9355-00001.parquet'), 2147483545: bytearray(b'\x01\x00\x00\x00\x00\x00\x00\x00')} | None | None | None | None | Row(bar=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None), dt=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None), foo=Row(column_size=None, value_count=None, null_value_count=None, nan_value_count=None, lower_bound=None, upper_bound=None)) |
This looks like an inconsistency, can anyone confirm?