Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support force option on RegisterTable procedure #5327

Closed
wants to merge 3 commits into from

Conversation

yabola
Copy link
Contributor

@yabola yabola commented Jul 21, 2022

fix #5163
Add 'force' option to register existing table, this is optional
usage: CALL mycatalogname.system.register_table('mycatalogname.mydb.mytablename','xx://xxxxx/metadata/xx.metadata.json','force')

@yabola
Copy link
Contributor Author

yabola commented Jul 21, 2022

@flyrain CC~

@flyrain
Copy link
Contributor

flyrain commented Jul 26, 2022

Hi @RussellSpitzer @szehon-ho, could you approve the workflows?

* @return a Table instance
* @throws AlreadyExistsException if the table already exists in the catalog.
*/
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation);
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation, boolean force);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit confusing for user by reusing the name registerTable for updating the metadata location, even with the force option. Also updating metadata location is a risky operation, we need to separate it from a safe operation(register a table), so that user won't accidentally do that. Suggested names would be resetTable or resetMetadataLocation.

Copy link
Contributor Author

@yabola yabola Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flyrain Do I need to introduce a new Spark procedure method, or just reuse the registerTable procedure?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a new procedure will be better. Would like to hear from others. @szehon-ho @RussellSpitzer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I was initially thinking a separate one will be more clear .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked it being just an additional parameter, but I think I'm outvoted here :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on @szehon-ho 's suggestion below perhaps this should be "resetTable" based on the forReset comments below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

* @return a Table instance
* @throws AlreadyExistsException if the table already exists in the catalog.
*/
default Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
default Table registerTable(TableIdentifier identifier, String metadataFileLocation, boolean force) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same concern here.

@yabola
Copy link
Contributor Author

yabola commented Jul 26, 2022

Recently merged new code and I need to resolve the conflict first

throw new org.apache.iceberg.exceptions.AlreadyExistsException("Table already exists: %s", identifier);
}
HiveTableOperations ops = (HiveTableOperations) newTableOps(identifier);
TableMetadata base = ops.current();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to get the base? Passing the null should be fine, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If pass in base, we can change the commit method less, otherwise, the logic of commit checking meta will be incorrect. commit method means changing table meta from base to newMeta. I think this is good for maintaining the existing logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that the changes here will have the least impact on the overall method, and no additional logic will be added. The connection between methods is also self-consistent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should go through the operation.commit pathway here. Feels like what we are doing here is different than every other kind of commit. Really all we want to do is transactionally swap one parameter for another. I'm gonna think more about this tomorrow

String newMetadataLocation = base == null && metadata.metadataFileLocation() != null ?
metadata.metadataFileLocation() : writeNewMetadata(metadata, currentVersion() + 1);
boolean isTableForRegister = metadata.isForRegister();
String newMetadataLocation = isTableForRegister ? metadata.metadataFileLocation()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if metadata.metadataFileLocation() is null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will not be null, only registerTable will take this value, the registerTable method has ensured that it is not null

* @return a Table instance
* @throws AlreadyExistsException if the table already exists in the catalog.
*/
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation);
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation, boolean force);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I was initially thinking a separate one will be more clear .

@@ -215,8 +215,9 @@ protected void doRefresh() {
@SuppressWarnings("checkstyle:CyclomaticComplexity")
@Override
protected void doCommit(TableMetadata base, TableMetadata metadata) {
String newMetadataLocation = base == null && metadata.metadataFileLocation() != null ?
metadata.metadataFileLocation() : writeNewMetadata(metadata, currentVersion() + 1);
boolean isTableForRegister = metadata.isForRegister();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be called 'forReset()' or something more descriptive. This is a bit ugly but couldn't think of a good way to avoid it. I suppose just having a metadata with a location != null , regardless of whether base is null or not , is not enough to signal its a reset? @aokolnychyi @RussellSpitzer @rdblue for thoughts?

@@ -215,8 +215,9 @@ protected void doRefresh() {
@SuppressWarnings("checkstyle:CyclomaticComplexity")
@Override
protected void doCommit(TableMetadata base, TableMetadata metadata) {
String newMetadataLocation = base == null && metadata.metadataFileLocation() != null ?
metadata.metadataFileLocation() : writeNewMetadata(metadata, currentVersion() + 1);
boolean isTableForRegister = metadata.isForRegister();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also seems like we should implement this for all the catalogs, or at least error there if this path is chosen.

@@ -235,19 +235,21 @@ public void renameTable(TableIdentifier from, TableIdentifier originalTo) {
}

@Override
public org.apache.iceberg.Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
public org.apache.iceberg.Table registerTable(TableIdentifier identifier,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please rebase.

#5037 has moved register table to BaseMetastoreCatalog

@yabola
Copy link
Contributor Author

yabola commented Jul 29, 2022

@flyrain @RussellSpitzer @szehon-ho @ajantha-bhat I had separated the resetTable API and do not reuse the commit method (from @RussellSpitzer Feels like what we are doing here is different than every other kind of commit.)
Do you think it is more clear? If it looks good to you, I will finish some other catalog resetTable API.

Comment on lines 353 to 355
try {
lockId = Optional.of(acquireLock());
Table tbl = loadHmsTable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like this. Too big part of the commit code is duplicated, and this is especially true for the error handling. This part is complicated and error prone, so we might want to try to find a better way to wave the to parts of the code together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to abstract this part of the locking logic, let me think about it~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pvary I abstract the part of the metadata locking logic. Please have a look.

@yabola yabola changed the title Support force option on registerTable procedure Support resetTable procedure Jul 30, 2022
* @param metadataFileLocation the location of a metadata file
* @return a Table instance
*/
default Table resetTable(TableIdentifier identifier, String metadataFileLocation) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to add another method to Catalog? What is the difference between this and dropTable followed by registerTable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But It may have a time gap, when there are consecutive queries. And it doesn't seem guaranteed to be a transactional operation

@rdblue
Copy link
Contributor

rdblue commented Jul 31, 2022

I think we should be very careful about introducing new methods to the Catalog API. There needs to be a strong justification for adding to it, and I don't think merging this PR is a good idea.

@yabola
Copy link
Contributor Author

yabola commented Aug 1, 2022

I think we should be very careful about introducing new methods to the Catalog API. There needs to be a strong justification for adding to it, and I don't think merging this PR is a good idea.

How about reuse registerTable procedure?

@rdblue
Copy link
Contributor

rdblue commented Aug 1, 2022

How about reuse registerTable procedure?

If you mean to add more arguments to registerTable so that it is an atomic operation, I don't really see much point to it. If this is a recovery operation for cases like the ones @szehon-ho listed in the issues, then why does it need to be atomic? The table is already broken.

We added registerTable as a way to migrate to a new metastore. At the time, there were concerns about people misusing it to fork tables and do other unreliable things. I still have that concern for this operation. Reusing registerTable as-is seems like the best way to avoid those problems.

@flyrain
Copy link
Contributor

flyrain commented Aug 2, 2022

My concern for dropping table is that a catalog like HMS usually has hooks to react on events like a table is dropped. We don't really want to drop a table here, but it may trigger something unexpected.

@RussellSpitzer
Copy link
Member

I prefer this approach (I think I said so before :) )

Yes fine you win :). Yea the only kind of downside is if we error re-creating the table, the table is dropped as the two are not atomic. But register can just be called again in that case. Agree its not worth the complexity to implement atomicity for a small gain.

I thought we could just switch the "create" to an "alter" and keep atomicity but I didn't look too much into that.

@yabola
Copy link
Contributor Author

yabola commented Aug 6, 2022

@szehon-ho as the discussion above, please take a look on my latest change. This modification can minimally invade the commit logic. And the original registerTable means table from null to a new table. New logic (drop then create) is also in line with the original logic.

@yabola yabola changed the title Support resetTable procedure Support force option on RegisterTable procedure Aug 6, 2022
@rdblue
Copy link
Contributor

rdblue commented Aug 7, 2022

@szehon-ho, what I originally meant was that it would be fine for the caller to manually call dropTable first. I'm hesitant to add a force option because it's basically the same thing as a reset, we just hide it a bit.

Thinking about the problem a bit more, I'm not sure that the use cases we're trying to satisfy make this a catalog operation. If this is about replacing the metadata for a single existing table, why not make this a table operation? Or better yet, why not use the existing table operation to do this? If the table exists, then you can always call table.operations().commit(table.operations().current(), replacementMetadata).

@RussellSpitzer
Copy link
Member

@rdblue The issue with those solutions for our use case is that they only work if the current metadata is valid. If not, we cannot drop the table via the Catalog api (requires loading the metadata first) which has led me to constantly showing folks how to access the hive via the Spark external client wrapper. The same issue occurs if we want to make this a table operation, we can't perform an operation on a table which doesn't have a valid metadata.json.

Our issue is that in Spark we can't Drop the table or point to a new metadata without using an api for the metastore directly. Ideally I think the Iceberg catalog implementation should provide an exit valve for these situations so users aren't required to manually connect to the underlying catalog system themselves.

I am open to other solutions here but I'm not very worried about folks overusing this api, it tends to only come up when the user is already in a really bad state.

@yabola
Copy link
Contributor Author

yabola commented Aug 8, 2022

But call table.operations().commit(table.operations().current(), replacementMetadata) can't update the metadata location (There is some check logic in this method that prevents changes).
I think this PR provides a command line way to accomplish this function.

@rdblue
Copy link
Contributor

rdblue commented Aug 8, 2022

@RussellSpitzer, I see what you mean. What about fixing the Hive catalog so that the commit succeeds if table metadata can't be loaded? It sounds like we should have better handling for broken tables.

@RussellSpitzer
Copy link
Member

@RussellSpitzer, I see what you mean. What about fixing the Hive catalog so that the commit succeeds if table metadata can't be loaded? It sounds like we should have better handling for broken tables.

That would be ok for me as well, I'm really just looking for a blessed way of fixing these states. I think allowing for dropping an non-iceberg or iceberg table with invalid metadata is probably ok too but that does seem a little scary to me than having an explicit register command which we know isn't part of normal operations.

@szehon-ho
Copy link
Collaborator

What about fixing the Hive catalog so that the commit succeeds if table metadata can't be loaded? It sounds like we should have better handling for broken tables.

Are we suggesting change dropTable to continue if table metadata cannot be loaded? (I assume we wont add any force flag?). That sounds like a great feature, its one of the pain points using Iceberg at least on Hive catalog, dropping a tables where metadata doesnt exist, would love to see a pr on that.

It's not the complete solution to the issue I initially filed (rollback corrupt metadata) though, and agree it still would be nice if there's a blessed way somehow

@rdblue
Copy link
Contributor

rdblue commented Aug 8, 2022

@szehon-ho, in part, yes. I think that if we have situations where a table is corrupted then we should handle those in a better way than just having a table that requires manual intervention. Making DROP work for those tables sounds like a good idea to me. Maybe we could also update registerTable to overwrite a corrupt Iceberg table using similar logic. Then we wouldn't need a flag.

@flyrain
Copy link
Contributor

flyrain commented Aug 8, 2022

+1 to detect if the table is corrupt first, then register the table with the new metadata file location. No flag is needed.

@szehon-ho
Copy link
Collaborator

Sounds good. The only corruption detection we can tell now is if the metadata.json no longer exists, I think, unless there are other things we can easily detect.

My initial thought for this pr was actually for non-corrupt case (ie, I have a metadata json that loads fine, but the metadata is wrong, eg issues listed in #5163). But I guess the consensus is , the user just writes a script to drop and register the Iceberg table with the right metadata json in that case, and we don't want to extend Iceberg Catalog at this point to handle those cases.

@yabola sorry about that, but I guess if you want to instead make a patch for the issue (make HiveCatalog.dropTable able to handle non-exist metadata json).

@yabola
Copy link
Contributor Author

yabola commented Aug 9, 2022

@szehon-ho Sure, I will make another PR for drop corrupt table. Thanks for all your discussions !

@GabeChurch
Copy link

GabeChurch commented Mar 31, 2023

I'm going to bump this as well.

It is potentially extremely helpful in cases where users want to "extend-read-only" s3-access to an iceberg table across environments, but have a separate hive metastores. For instance -- say users have a sandbox environment where they want to make a table from a production environment accessible that doesn't depend on atomicity, extending read-only access would save a HUGE expense in replication.

With the existing procedure the only option is to force drop the table before re-registering a new metafile, which can break other things attempting to access the table. It would be much better to maintain a lock on the table until the metadata has changed (drop+register at API level) so that other queries do not fail if they attempt to query while new metadata is being written via the force option mentioned.

If we can't add a "force" option to the "iceberg.system.register_table" we should add a new "non-storage-level-invasive" method to update metadata without requiring end users to drop a table via another system process -- maybe something like "iceberg.system.override_registered_table"

@rushilshah1
Copy link
Contributor

I am also interested in this functionality as well! Are there still plans to add this?

@abfisher0417
Copy link

I am also interested in this capability. I have a use case where I create/update Iceberg tables in an isolated cloud environment separate from consumers/readers of those tables. Consumers would periodically sync all metadata/data files from the producing cloud storage bucket to their own cloud storage bucket; consumers would also have their own catalog(s) that point to the underlying table metadata/data files.

The RegisterTable procedure today would allow partners in this scenario to add tables to their own catalog. However, there's not a "clean" way to point an existing table to a newer version of the same table without first dropping the table the re-adding it.

@yabola
Copy link
Contributor Author

yabola commented Jan 23, 2024

@abfisher0417 Thank you. If community agrees, I can complete this PR again.

Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support catalog method to set table metadata
10 participants