RESTTableOperations does not support table metadata swap like other TableOperations do #12134

@dramaticlly

Apache Iceberg version

1.7.1 (latest release)

Query engine

Spark

Please describe the bug 🐞

Before migrating to the REST catalog, we relied on the following TableOperations.commit API call to swap table metadata atomically.

String desiredMetadataPath = "/var/newdb/table/metadata/00003-579b23d1-4ca5-4acf-85ec-081e1699cb83.metadata.json";
ops.commit(ops.current(), TableMetadataParser.read(ops.io(), desiredMetadataPath));

However, this no longer works with a REST-based catalog. I suspect it relates to how the update type was modeled here: metadata.changes() returns an empty list when the metadata is read from the parser, so the update table POST call ends up with an empty changeset.
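To make the suspicion above concrete, here is a toy model of the two commit styles. It is only a sketch: ToyMetadata, pointerSwapCommit, and changeBasedCommit are hypothetical names invented for illustration, not Iceberg classes or methods, and the real REST commit path may differ in its details. The sketch assumes that metadata read back from a file carries full state but no recorded changes, and that a change-based commit sends only those recorded changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types are Iceberg classes. */
public class MetadataSwapSketch {

  /** Hypothetical stand-in for table metadata: state plus the changes that produced it. */
  public static final class ToyMetadata {
    public final Map<String, String> properties;
    public final List<String> changes; // each entry looks like "set:key=value"

    ToyMetadata(Map<String, String> properties, List<String> changes) {
      this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
      this.changes = Collections.unmodifiableList(new ArrayList<>(changes));
    }

    /** Built through an API: every modification is also recorded as a change. */
    public ToyMetadata withProperty(String key, String value) {
      Map<String, String> props = new HashMap<>(properties);
      props.put(key, value);
      List<String> newChanges = new ArrayList<>(changes);
      newChanges.add("set:" + key + "=" + value);
      return new ToyMetadata(props, newChanges);
    }

    /** Read back from a file: the full state is known, but no changes are attached. */
    public static ToyMetadata parsed(Map<String, String> properties) {
      return new ToyMetadata(properties, Collections.emptyList());
    }
  }

  /** Pointer-swap style commit: the catalog adopts the new metadata as a whole. */
  public static ToyMetadata pointerSwapCommit(ToyMetadata base, ToyMetadata updated) {
    return updated;
  }

  /** Change-based commit: only the recorded changes are applied on top of the base. */
  public static ToyMetadata changeBasedCommit(ToyMetadata base, ToyMetadata updated) {
    Map<String, String> props = new HashMap<>(base.properties);
    for (String change : updated.changes) {
      String[] kv = change.substring("set:".length()).split("=", 2);
      props.put(kv[0], kv[1]);
    }
    return ToyMetadata.parsed(props);
  }

  public static void main(String[] args) {
    // Current table state: an older state plus one later modification.
    ToyMetadata current =
        ToyMetadata.parsed(Map.of("owner", "someone")).withProperty("stage", "v3");
    // Desired state: the older metadata, re-read from its file (so no changes attached).
    ToyMetadata desired = ToyMetadata.parsed(Map.of("owner", "someone"));

    System.out.println("pointer swap:  " + pointerSwapCommit(current, desired).properties);
    System.out.println("change-based:  " + changeBasedCommit(current, desired).properties);
  }
}
```

In this model the pointer swap ends up with the desired properties, while the change-based commit is a no-op and the later "stage" property stays in place, because the parsed metadata has nothing in its change list.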

This can be reproduced by adding the following test case to org.apache.iceberg.catalog.CatalogTests.java. It passes for all other catalogs and fails only for TestRESTCatalog.

repro:

  @Test
  public void testTableOperationCommit() {
    C catalog = catalog();

    if (requiresNamespaceCreate()) {
      catalog.createNamespace(TABLE.namespace());
    }

    Map<String, String> properties =
        ImmutableMap.of("user", "someone", "created-at", "2023-01-15T00:00:01");
    Table originalTable =
        catalog
            .buildTable(TABLE, SCHEMA)
            .withPartitionSpec(SPEC)
            .withSortOrder(WRITE_ORDER)
            .withProperties(properties)
            .create();
    TableOperations ops = ((BaseTable) originalTable).operations();
    String original = ops.current().metadataFileLocation();
    FileIO io = ops.io();

    originalTable.newFastAppend().appendFile(FILE_A).commit();
    originalTable.newFastAppend().appendFile(FILE_B).commit();

    String metadataLocation = ops.refresh().metadataFileLocation();
    System.out.printf("After write, metadata location is: %s%n", metadataLocation);

    ops.commit(ops.refresh(), TableMetadataParser.read(originalTable.io(), original));

    originalTable.refresh();
    TableMetadata actual = ((BaseTable) originalTable).operations().current();
    TableMetadata expected = new StaticTableOperations(original, io).current();

    assertThat(actual.properties())
        .as("Props must match")
        .containsAllEntriesOf(expected.properties());
    assertThat(actual.schema().asStruct())
        .as("Schema must match")
        .isEqualTo(expected.schema().asStruct());
    assertThat(actual.specs()).as("Specs must match").isEqualTo(expected.specs());
    assertThat(actual.sortOrders()).as("Sort orders must match").isEqualTo(expected.sortOrders());
    assertThat(actual.currentSnapshot())
        .as("Current snapshot must match")
        .isEqualTo(expected.currentSnapshot());
    assertThat(actual.snapshots()).as("Snapshots must match").isEqualTo(expected.snapshots());
    assertThat(actual.snapshotLog()).as("History must match").isEqualTo(expected.snapshotLog());

    TestHelpers.assertSameSchemaMap(actual.schemasById(), expected.schemasById());

    assertThat(actual)
        .isEqualToIgnoringGivenFields(
            expected,
            "metadataFileLocation",
            "schemas",
            "specs",
            "sortOrders",
            "properties",
            "schemasById",
            "specsById",
            "sortOrdersById",
            "snapshots");
  }

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Labels: bug
