Iceberg Connector #1324

# TODOs for the Iceberg Connector

- [x] Update the README to reflect the current status, and convert it to proper connector documentation before announcing the connector as ready for use (https://github.com/trinodb/trino/pull/4537, https://github.com/trinodb/trino/pull/5887)
- [ ] Lowercase all field names read from Iceberg metadata files.
- [x] Fix table listing to skip non-Iceberg tables. This will need a new metastore method to list tables filtered on a property name, similar to how view listing works in `ThriftHiveMetastore`.
- [x] Predicate pushdown is currently broken, which means delete is also broken. The code from the original `getTableLayouts()` implementation needs to be updated for `applyFilter()`.
- [x] Delete is broken and should be fixed. Note that, unlike the Hive connector, the Iceberg connector should support row-by-row deletion.
- [x] All of the `HdfsContext` calls that use `/tmp` need to be fixed.
- [x] `HiveConfig` needs to be removed. We might need to split out separate config classes in the Hive connector for the components that are reused in Iceberg.
- [x] We should try to remove `HiveColumnHandle`. This will require replacing or abstracting `HivePageSource`, which is currently used to handle schema evolution and prefilled column values (identity partitions).
- [x] Writing of decimals and timestamps is broken, since their representation in Parquet seems to be different for Iceberg and Hive. Reads are probably also broken, but this isn't tested yet since writes don't work. We will need a native Parquet writer to fix this.
- [x] ~UUID type is not implemented and will be dropped from the Iceberg specification.~
- [x] https://github.com/trinodb/trino/issues/6663
- [x] Implement time type.
- [x] Partition table
- [x] History table
- [x] Snapshots table
- [x] Manifests table
- [x] Files table
- [x] Return table statistics so CBO can leverage them.
- [x] Add implementation and tests for table comments.
- [x] Add implementation and tests for column comments.
- [x] Needs complete tests for all data types and all partitioning transforms.
- [x] Needs integration tests (probably as product tests) for interoperability with Spark in both directions (write Spark -> read Presto, write Presto -> read Spark).
- [x] Needs correctness tests for partition pruning (also validate that the pushdown is happening by checking the query plans?): https://github.com/prestosql/presto/issues/2660
- [x] Add tests for `CREATE TABLE LIKE`.
- [x] Add test for creating `NOT NULL` columns.
- [x] Add tests for non-Iceberg tables (listing tables in a schema, listing columns in a schema, describing a table, selecting from a table): https://github.com/trinodb/trino/issues/5459
- [x] Add product tests: https://github.com/prestosql/presto/issues/2304
- [x] https://github.com/trinodb/trino/issues/13196
- [x] https://github.com/trinodb/trino/issues/9843
- [x] Add procedure for *rollback table to snapshot*.
- [x] ORC support https://github.com/prestosql/presto/pull/2042
- [x] https://github.com/trinodb/trino/pull/12125
- [ ] https://github.com/prestosql/presto/issues/2298
- [x] `NOT NULL` enforcement
- [x] `location` or `external_location` table property https://github.com/prestosql/presto/issues/2501
- [x] Use metastore locking around read-modify-write operations for transaction commit https://github.com/trinodb/trino/issues/9583
- [ ] Iceberg commit retries #9582
- [x] Add tests for truncate on numeric types #5456
- [ ] Add tests for partition transforms on structured types #5458
- [x] Add tests for Hive tables in the same metastore #5459
- [x] Dereference Pushdown for Iceberg Connector #5179
- [x] Flaky test TestIcebergCreateTable.testCreateTable #4864
- [x] Add support for partition evolution #7580
  - [x] Trino cannot read an Iceberg table that has dropped a partition field #8284
- [x] Test bucketing consistency and stability, like Hive's `TestHiveBucketing`.
- [x] Support predicate pushdown and metadata deletion for non-partition columns #7905
- [x] Run Iceberg product tests with all tested Hive distributions #7898
- [ ] Improve test coverage around partitioned tables and the `$partitions` system table https://github.com/trinodb/trino/issues/7972
- [x] Add support for Trino views in Iceberg connector #8540
- [x] https://github.com/trinodb/trino/issues/8623
- [ ] Fix reading of specific Iceberg snapshots #8663
- [ ] #8690
- [x] https://github.com/trinodb/trino/issues/8693
- [x] Support use-preferred-write-partitioning for the Iceberg connector #8682
- [ ] Improve performance of Iceberg decimal bucket transform #8724
- [x] Unexpected results when reading Iceberg Parquet table after nested field schema evolved #8750
- [ ] Evaluate Apache Iceberg's support for predicate on structural types #8759
- [x] Add $file hidden column in Iceberg connector #8769
- [ ] Populate split_offsets in Iceberg data files #9018
- [x] Iceberg partition pruning does not work for predicates not expressible by tuple domain #9309
- [x] https://github.com/trinodb/trino/issues/4115
- [x] Support Glue metastore in Iceberg connector #9363
- [x] Excessive metastore invocations when querying Iceberg table #8675
- [x] Reject Hive configuration properties that have no meaning for Iceberg #9607
- [ ] IcebergSplitSource: Support large IN predicates #9743
- [x] Incorrect query results for Iceberg table partitioned on varbinary / binary #9755
- [x] Revamp Iceberg statistics reporting #9716
- [x] SHOW STATS fails with NPE when Iceberg file has no columns with stats #9714
- [x] SHOW STATS fails if Iceberg metadata has no statistics for a file #9707
- [x] Incorrect values returned when Iceberg table partitioned by timestamp with time zone #9704
- [x] Query failure when reading from $partitions when Iceberg table partitioned on timestamp with time zone #9703
- [x] Garbage return value from Iceberg $partitions for varbinary non-partition column #9756
- [ ] IcebergSplitSource throws away CombinedScanTask combinations #8486
- [x] #9810
- [ ] https://github.com/trinodb/trino/issues/9852
- [x] https://github.com/trinodb/trino/issues/9953
- [x] https://github.com/trinodb/trino/issues/9968
- [x] #8919
- [x] #10058
- [x] https://github.com/trinodb/trino/issues/9791
- [x] https://github.com/trinodb/trino/pull/10173
- [ ] https://github.com/trinodb/trino/issues/10245
- [x] https://github.com/trinodb/trino/issues/10758
- [x] https://github.com/trinodb/trino/issues/10786
- [ ] https://github.com/trinodb/trino/issues/11000
- [x] https://github.com/trinodb/trino/issues/12138
- [x] https://github.com/trinodb/trino/pull/12026
- [x] https://github.com/trinodb/trino/issues/12362
- [x] https://github.com/trinodb/trino/pull/10258
- [x] https://github.com/trinodb/trino/issues/12743
- [x] https://github.com/trinodb/trino/issues/12785
- [x] https://github.com/trinodb/trino/issues/5632
- [x] https://github.com/trinodb/trino/issues/13413
