feat: Support metadata table "Metadata Log Entries"#846
feat: Support metadata table "Metadata Log Entries"#846rshkv wants to merge 4 commits intoapache:mainfrom
Conversation
bdb6372 to
067ada8
Compare
crates/iceberg/src/metadata_scan.rs
Outdated
| fn snapshot_id_as_of_time( | ||
| table_metadata: &TableMetadataRef, | ||
| timestamp_ms_inclusive: i64, | ||
| ) -> Option<&SnapshotRef> { | ||
| let mut snapshot_id = None; | ||
| // The table metadata snapshot log is chronological | ||
| for log_entry in table_metadata.history() { | ||
| if log_entry.timestamp_ms <= timestamp_ms_inclusive { | ||
| snapshot_id = Some(log_entry.snapshot_id); | ||
| } | ||
| } | ||
| snapshot_id.and_then(|id| table_metadata.snapshot_by_id(id)) | ||
| } |
There was a problem hiding this comment.
This assumes that history() is chronologically ordered because that gets asserted TableMetadata#try_normalize.
067ada8 to
8d76183
Compare
crates/iceberg/src/metadata_scan.rs
Outdated
| // Include the current metadata locaction and modification time in the table. This matches | ||
| // the Java implementation. Unlike the Java implementation, a current metadata location is | ||
| // optional here. In that case, we omit current metadata from the metadata log table. | ||
| if let Some(current_metadata_location) = &scan.metadata_location { | ||
| append_metadata_log_entry( | ||
| scan.metadata_ref.last_updated_ms(), | ||
| current_metadata_location, | ||
| ); | ||
| } |
| .metadata_location(table_metadata1_location.as_os_str().to_str().unwrap()) | ||
| .metadata_location(template_json_location) |
There was a problem hiding this comment.
Before this change, the location in the metadata log and the current location would both be ../metadata/v1.json.
I wanted to have a distinction so we can assert that metadata_log_entries includes the current metadata location, even if not in the metadata log.
|
#822 has been merged, let's move on! |
8d76183 to
395ff1f
Compare
|
Thank you, @Xuanwo. This is rebased and ready for review. |
76c2bf2 to
439c529
Compare
crates/iceberg/src/metadata_table.rs
Outdated
| -- validity: | ||
| -- validity: |
There was a problem hiding this comment.
This was failing for me locally.
| pub mod spec; | ||
|
|
||
| pub mod metadata_scan; | ||
| pub mod metadata_table; |
There was a problem hiding this comment.
Follow-up to this comment: #822 (comment)
Don't need to do in this PR. Up to you.
| /// Creates a metadata table which provides table-like APIs for inspecting metadata. | ||
| /// See [`MetadataTable`] for more details. | ||
| pub fn metadata_table(self) -> MetadataTable { | ||
| pub fn metadata_table(&self) -> MetadataTable<'_> { |
There was a problem hiding this comment.
Follow-up to this comment: #822 (comment)
Also not necessary to do in this PR.
|
Let's get #863 merge first. |
+1. |
Re #823. This adds support for the Metadata Log Entries metadata table.
This is building on @xxchan's unmerged #822. I'll update and rebase this PR when #822 merges.✅Metadata Log Entries is the metadata log with the latest snapshot per metadata file.
Reference implementations: