
Core: Metadata table queries fail if a partition column was reused in V2 #4662

Merged · 6 commits · Jun 22, 2022

Conversation

szlta (Contributor) commented Apr 28, 2022

First go at fixing #4661.
The proposed solution described at #3411 (comment) seems hardly feasible, so this would be a quick fix to simply avoid the invalid-schema errors thrown in any metadata query where a partition column was reused.

szehon-ho (Collaborator) commented Apr 30, 2022

I still need to ramp up a bit on this, but I thought in that discussion we were going to collapse all of the same partition fields (if some are re-used), since they are uniquely identified by spec ID? I.e., just "data" and "id".

example for reference:

PartitionSpec initialSpec = PartitionSpec.builderFor(SCHEMA)
    .identity("data")
    .build();
TestTables.TestTable table = TestTables.create(tableDir, "test", SCHEMA, initialSpec, V2_FORMAT_VERSION);

table.updateSpec()
    .removeField("data")
    .commit();

table.updateSpec()
    .addField("data")
    .addField("id")
    .commit();

struct<1000: data: optional string, 1001: data: optional string, 1002: id: optional int>

I am wondering: does the code try to re-use old partition values in both metadata (ManifestEvaluator) and data filtering? I thought it does, but I need to check; let me know if that assumption is incorrect. In this example, we would hopefully use the "data" partition values written under the table's initial spec for queries issued after re-adding "data" to the spec. So it may make sense to collapse the fields together in the partitions/metadata table? I welcome any thoughts from @rdblue or @aokolnychyi, who have more background from the initial discussion.

szlta (Contributor, author) commented May 2, 2022

Hey @szehon-ho, this was just intended as a quick fix so that we don't throw exceptions for the case you mentioned either.
Currently, if we want to collapse the matching partition columns, we will hit the following issue:

  • suppose we have 1000: data, 1001: data, 1002: id in the combined partition type
  • Partitioning.partitionType could collapse this into 1001: data, 1002: id (dropping the last-seen "data" while walking back through the specs)
  • this collapsed type becomes the "read schema" when reading the manifest files (which are Avro files underneath)
  • for spec0 manifests, ID matching will see that the file schema has 1000: data but the read schema doesn't, so the read will simply return null for 1001: data and null for 1002: id, leaving us no information to collapse upon

Perhaps we could rework the re-addition of the "data" column to the spec so that no new field ID is generated for it? I'm not sure whether that would cause other issues, though.
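The projection problem described above can be sketched outside Iceberg. The toy model below (the class, method, and the way values are stored are illustrative, not Iceberg's actual API) shows how projecting a spec0 manifest row by field ID against a collapsed read schema that only keeps the re-added IDs yields nothing but nulls:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of Avro-style projection by field ID: a manifest row written under
// spec0 carries values keyed by the old field IDs; reading it with a schema
// that only knows the latest IDs returns null for every field.
public class ProjectionSketch {
    // Project a written row onto the read schema's field IDs; unmatched IDs map to null.
    static Map<Integer, String> project(Map<Integer, String> row, List<Integer> readIds) {
        Map<Integer, String> out = new HashMap<>();
        for (int id : readIds) {
            out.put(id, row.get(id)); // null when the file never wrote this ID
        }
        return out;
    }

    public static void main(String[] args) {
        // spec0 wrote "data" under field ID 1000
        Map<Integer, String> spec0Row = Map.of(1000, "foo");
        // collapsed read schema keeps only the re-added IDs 1001 (data) and 1002 (id)
        Map<Integer, String> read = project(spec0Row, List.of(1001, 1002));
        System.out.println(read.get(1001)); // null: the old "data" value is unreachable
        System.out.println(read.get(1002)); // null
    }
}
```

Since the read resolves purely by field ID, the value written under 1000 can never surface under 1001, which is the "no info to collapse upon" problem.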

szehon-ho (Collaborator) commented:

Yeah, I understand. My takeaway from the earlier conversation was that the optimal solution is to re-use the field ID when the transformation is equivalent to an earlier partition field's. I think the thing to check is whether regular metadata filtering is affected or not (today it seems to work, via logic in the projection filter being able to resolve it).

szlta force-pushed the fixPartColReuseV2 branch from de72beb to adc57aa on May 11, 2022
szlta (Contributor, author) commented May 12, 2022

@szehon-ho I think we were talking about the same thing: generating no new ID for a PartitionField that was already seen. I have amended my change accordingly. I looked into this in some detail and found that something similar already happens when we remove some of the PartitionFields:

  • we start with
    spec0: 1000: identity(a), 1001: identity(b)
  • then, if we update the spec so that only a remains, a doesn't get reassigned a new field ID, although we do create a new spec:
    spec1: 1000: identity(a)
  • now, if we re-add b and also add c, b would get assigned ID 1002 and c would get ID 1003
  • I propose we look through the past specs to see whether b already participated as a partition field (based on transform function and source ID) and recycle the old ID, so we end up with:
    spec2: 1000: identity(a), 1001: identity(b), 1002: identity(c)
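The proposed recycling step can be sketched in isolation. This is a simplified model, not Iceberg's BaseUpdatePartitionSpec: partition fields are reduced to (field ID, source column ID, transform) triples, and re-adding a field scans the past specs for a matching (source ID, transform) pair before minting a new ID. The IDs mirror the example above.

```java
import java.util.List;

// Simplified sketch of ID recycling across spec evolution (not Iceberg code).
public class IdRecycleSketch {
    record Field(int fieldId, int sourceId, String transform) {}

    // spec0: 1000 identity(a), 1001 identity(b); spec1: b was removed
    static final List<List<Field>> PAST_SPECS = List.of(
        List.of(new Field(1000, 1, "identity"), new Field(1001, 2, "identity")),
        List.of(new Field(1000, 1, "identity")));

    // Re-use an old field ID when an equivalent field existed in any past spec;
    // otherwise hand out the next fresh ID.
    static int assignId(int sourceId, String transform, int nextId) {
        for (List<Field> spec : PAST_SPECS) {
            for (Field f : spec) {
                if (f.sourceId() == sourceId && f.transform().equals(transform)) {
                    return f.fieldId(); // recycle the old ID
                }
            }
        }
        return nextId; // brand-new field: fresh ID
    }

    public static void main(String[] args) {
        System.out.println(assignId(2, "identity", 1002)); // re-adding b recycles 1001
        System.out.println(assignId(3, "identity", 1002)); // new column c gets 1002
    }
}
```

With the recycled ID, manifests written under spec0 and the re-added field agree on field ID 1001, so projection by ID can find the old partition values again.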

Looking at the tests, this should work; there's only one failure left, in TestMetadataTableScans.testFilesTableScanWithDroppedPartition:

Expected :struct<1000: data_bucket: optional int, 1001: data_bucket_16: optional int, 1002: data_trunc_2: optional string>
Actual   :struct<1000: data_bucket_16: optional int, 1001: data_trunc_2: optional string>

This made me think, though: it seems we have two different ways of generating the partition field column name:
https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/BaseUpdatePartitionSpec.java#L460 generates data_bucket_16, while
https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/PartitionSpec.java#L484 generates data_bucket. Does anyone know why this discrepancy exists? My gut feeling is that the former is the correct way.
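For illustration, the two conventions can be sketched as two hypothetical name generators (the method names and signatures here are made up for the sketch; see the linked lines for the real code):

```java
// Hypothetical sketch of the two naming conventions mentioned above,
// not the actual Iceberg implementation.
public class NameSketch {
    // BaseUpdatePartitionSpec-style: the transform parameter is part of the name
    static String withWidth(String column, String transform, int width) {
        return column + "_" + transform + "_" + width; // e.g. data_bucket_16
    }

    // PartitionSpec builder-style: the transform parameter is omitted
    static String withoutWidth(String column, String transform) {
        return column + "_" + transform; // e.g. data_bucket
    }

    public static void main(String[] args) {
        System.out.println(withWidth("data", "bucket", 16));  // data_bucket_16
        System.out.println(withoutWidth("data", "bucket"));   // data_bucket
    }
}
```

Including the width disambiguates two bucket transforms of different widths on the same column, which is presumably why the former reads as the safer convention.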

szehon-ho (Collaborator) left a review:

This change looks like the right direction to me.

Is there any way to validate that the data scan correctly filters out the right files in this scenario (when we remove and add back a partition field), maybe by adding a test in TestDataTableScan?

Regarding your question about TestMetadataTableScans.testFilesTableScanWithDroppedPartition: yes, I agree the former seems the correct name to me, and it seems we should try to align them, but I don't have much context; maybe @aokolnychyi, @rdblue, or @RussellSpitzer have more thoughts on that.

szehon-ho (Collaborator) left a second review:

This looks better to me, just some minor comments and some test failures.

szlta (Contributor, author) commented May 20, 2022

@aokolnychyi, @rdblue, or @RussellSpitzer, could you please share your opinion on #4662 (review), i.e. on the discrepancy in generating PartitionField names?

szlta added 6 commits June 10, 2022 16:08
… V2 - the collapse method

Change-Id: I91808def7cea2005e47b57d3033298afef26ca32
Change-Id: Ibba90e196e43fbde998d95ad02785f341c77f7ee
Change-Id: If25577df7655a13e21f4d6dd983b1b6c41601a2a
Change-Id: Ib70d0cefbce629e07d670a2f7876b5f638d87963
…leScans

Change-Id: I9b3e0cc02b728986607481de3b9bda1d7513b2c3
szlta force-pushed the fixPartColReuseV2 branch from 5fd6ad8 to 073edd0 on June 10, 2022
pvary (Contributor) commented Jun 15, 2022

Since the decision on #4868 is to stick with the old naming convention for default partition names, I think this is the best solution here. Please call out if you disagree; otherwise I will merge this soon.

Thanks,
Peter

pvary merged commit a5efb53 into apache:master on Jun 22, 2022
pvary (Contributor) commented Jun 22, 2022

Thanks for the fix @szlta, and thanks to @szehon-ho and @rdblue for the review!

namrathamyske pushed a commit to namrathamyske/iceberg that referenced this pull request Jul 10, 2022