-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support force option on RegisterTable procedure #5327
Conversation
@flyrain CC~ |
Hi @RussellSpitzer @szehon-ho, could you approve the workflows? |
* @return a Table instance | ||
* @throws AlreadyExistsException if the table already exists in the catalog. | ||
*/ | ||
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation); | ||
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation, boolean force); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit confusing for user by reusing the name registerTable
for updating the metadata location, even with the force
option. Also updating metadata location
is a risky operation, we need to separate it from a safe operation(register a table), so that user won't accidentally do that. Suggested names would be resetTable
or resetMetadataLocation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@flyrain Do I need to introduce a new Spark procedure method, or just reuse the registerTable procedure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a new procedure will be better. Would like to hear from others. @szehon-ho @RussellSpitzer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea I was initially thinking a separate one will be more clear .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked it being just an additional parameter, but I think I'm outvoted here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on @szehon-ho 's suggestion below perhaps this should be "resetTable" based on the forReset
comments below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
* @return a Table instance | ||
* @throws AlreadyExistsException if the table already exists in the catalog. | ||
*/ | ||
default Table registerTable(TableIdentifier identifier, String metadataFileLocation) { | ||
default Table registerTable(TableIdentifier identifier, String metadataFileLocation, boolean force) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same concern here.
Recently merged new code and I need to resolve the conflict first |
throw new org.apache.iceberg.exceptions.AlreadyExistsException("Table already exists: %s", identifier); | ||
} | ||
HiveTableOperations ops = (HiveTableOperations) newTableOps(identifier); | ||
TableMetadata base = ops.current(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to get the base
? Passing the null should be fine, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If pass in base, we can change the commit
method less, otherwise, the logic of commit checking meta will be incorrect. commit
method means changing table meta from base to newMeta. I think this is good for maintaining the existing logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that the changes here will have the least impact on the overall method, and no additional logic will be added. The connection between methods is also self-consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should go through the operation.commit pathway here. Feels like what we are doing here is different than every other kind of commit. Really all we want to do is transactionally swap one parameter for another. I'm gonna think more about this tomorrow
String newMetadataLocation = base == null && metadata.metadataFileLocation() != null ? | ||
metadata.metadataFileLocation() : writeNewMetadata(metadata, currentVersion() + 1); | ||
boolean isTableForRegister = metadata.isForRegister(); | ||
String newMetadataLocation = isTableForRegister ? metadata.metadataFileLocation() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check if metadata.metadataFileLocation()
is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will not be null, only registerTable
will take this value, the registerTable
method has ensured that it is not null
* @return a Table instance | ||
* @throws AlreadyExistsException if the table already exists in the catalog. | ||
*/ | ||
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation); | ||
Table registerTable(SessionContext context, TableIdentifier ident, String metadataFileLocation, boolean force); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea I was initially thinking a separate one will be more clear .
@@ -215,8 +215,9 @@ protected void doRefresh() { | |||
@SuppressWarnings("checkstyle:CyclomaticComplexity") | |||
@Override | |||
protected void doCommit(TableMetadata base, TableMetadata metadata) { | |||
String newMetadataLocation = base == null && metadata.metadataFileLocation() != null ? | |||
metadata.metadataFileLocation() : writeNewMetadata(metadata, currentVersion() + 1); | |||
boolean isTableForRegister = metadata.isForRegister(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be called 'forReset()' or something more descriptive. This is a bit ugly but couldn't think of a good way to avoid it. I suppose just having a metadata with a location != null , regardless of whether base is null or not , is not enough to signal its a reset? @aokolnychyi @RussellSpitzer @rdblue for thoughts?
@@ -215,8 +215,9 @@ protected void doRefresh() { | |||
@SuppressWarnings("checkstyle:CyclomaticComplexity") | |||
@Override | |||
protected void doCommit(TableMetadata base, TableMetadata metadata) { | |||
String newMetadataLocation = base == null && metadata.metadataFileLocation() != null ? | |||
metadata.metadataFileLocation() : writeNewMetadata(metadata, currentVersion() + 1); | |||
boolean isTableForRegister = metadata.isForRegister(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also seems like we should implement this for all the catalogs, or at least error there if this path is chosen.
@@ -235,19 +235,21 @@ public void renameTable(TableIdentifier from, TableIdentifier originalTo) { | |||
} | |||
|
|||
@Override | |||
public org.apache.iceberg.Table registerTable(TableIdentifier identifier, String metadataFileLocation) { | |||
public org.apache.iceberg.Table registerTable(TableIdentifier identifier, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please rebase.
#5037 has moved register table to BaseMetastoreCatalog
@flyrain @RussellSpitzer @szehon-ho @ajantha-bhat I had separated the resetTable API and do not reuse the |
try { | ||
lockId = Optional.of(acquireLock()); | ||
Table tbl = loadHmsTable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not like this. Too big part of the commit code is duplicated, and this is especially true for the error handling. This part is complicated and error prone, so we might want to try to find a better way to wave the to parts of the code together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to abstract this part of the locking logic, let me think about it~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pvary I abstract the part of the metadata locking logic. Please have a look.
* @param metadataFileLocation the location of a metadata file | ||
* @return a Table instance | ||
*/ | ||
default Table resetTable(TableIdentifier identifier, String metadataFileLocation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to add another method to Catalog
? What is the difference between this and dropTable
followed by registerTable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But It may have a time gap, when there are consecutive queries. And it doesn't seem guaranteed to be a transactional operation
I think we should be very careful about introducing new methods to the |
How about reuse |
If you mean to add more arguments to We added |
My concern for dropping table is that a catalog like HMS usually has hooks to react on events like a table is dropped. We don't really want to drop a table here, but it may trigger something unexpected. |
I thought we could just switch the "create" to an "alter" and keep atomicity but I didn't look too much into that. |
@szehon-ho as the discussion above, please take a look on my latest change. This modification can minimally invade the |
@szehon-ho, what I originally meant was that it would be fine for the caller to manually call Thinking about the problem a bit more, I'm not sure that the use cases we're trying to satisfy make this a catalog operation. If this is about replacing the metadata for a single existing table, why not make this a table operation? Or better yet, why not use the existing table operation to do this? If the table exists, then you can always call |
@rdblue The issue with those solutions for our use case is that they only work if the current metadata is valid. If not, we cannot drop the table via the Catalog api (requires loading the metadata first) which has led me to constantly showing folks how to access the hive via the Spark external client wrapper. The same issue occurs if we want to make this a table operation, we can't perform an operation on a table which doesn't have a valid metadata.json. Our issue is that in Spark we can't Drop the table or point to a new metadata without using an api for the metastore directly. Ideally I think the Iceberg catalog implementation should provide an exit valve for these situations so users aren't required to manually connect to the underlying catalog system themselves. I am open to other solutions here but I'm not very worried about folks overusing this api, it tends to only come up when the user is already in a really bad state. |
But call |
@RussellSpitzer, I see what you mean. What about fixing the Hive catalog so that the commit succeeds if table metadata can't be loaded? It sounds like we should have better handling for broken tables. |
That would be ok for me as well, I'm really just looking for a blessed way of fixing these states. I think allowing for dropping an non-iceberg or iceberg table with invalid metadata is probably ok too but that does seem a little scary to me than having an explicit register command which we know isn't part of normal operations. |
Are we suggesting change dropTable to continue if table metadata cannot be loaded? (I assume we wont add any force flag?). That sounds like a great feature, its one of the pain points using Iceberg at least on Hive catalog, dropping a tables where metadata doesnt exist, would love to see a pr on that. It's not the complete solution to the issue I initially filed (rollback corrupt metadata) though, and agree it still would be nice if there's a blessed way somehow |
@szehon-ho, in part, yes. I think that if we have situations where a table is corrupted then we should handle those in a better way than just having a table that requires manual intervention. Making DROP work for those tables sounds like a good idea to me. Maybe we could also update |
+1 to detect if the table is corrupt first, then register the table with the new metadata file location. No flag is needed. |
Sounds good. The only corruption detection we can tell now is if the metadata.json no longer exists, I think, unless there are other things we can easily detect. My initial thought for this pr was actually for non-corrupt case (ie, I have a metadata json that loads fine, but the metadata is wrong, eg issues listed in #5163). But I guess the consensus is , the user just writes a script to drop and register the Iceberg table with the right metadata json in that case, and we don't want to extend Iceberg Catalog at this point to handle those cases. @yabola sorry about that, but I guess if you want to instead make a patch for the issue (make HiveCatalog.dropTable able to handle non-exist metadata json). |
@szehon-ho Sure, I will make another PR for drop corrupt table. Thanks for all your discussions ! |
I'm going to bump this as well. It is potentially extremely helpful in cases where users want to "extend-read-only" s3-access to an iceberg table across environments, but have a separate hive metastores. For instance -- say users have a sandbox environment where they want to make a table from a production environment accessible that doesn't depend on atomicity, extending read-only access would save a HUGE expense in replication. With the existing procedure the only option is to force drop the table before re-registering a new metafile, which can break other things attempting to access the table. It would be much better to maintain a lock on the table until the metadata has changed (drop+register at API level) so that other queries do not fail if they attempt to query while new metadata is being written via the force option mentioned. If we can't add a "force" option to the "iceberg.system.register_table" we should add a new "non-storage-level-invasive" method to update metadata without requiring end users to drop a table via another system process -- maybe something like "iceberg.system.override_registered_table" |
I am also interested in this functionality as well! Are there still plans to add this? |
I am also interested in this capability. I have a use case where I create/update Iceberg tables in an isolated cloud environment separate from consumers/readers of those tables. Consumers would periodically sync all metadata/data files from the producing cloud storage bucket to their own cloud storage bucket; consumers would also have their own catalog(s) that point to the underlying table metadata/data files. The RegisterTable procedure today would allow partners in this scenario to add tables to their own catalog. However, there's not a "clean" way to point an existing table to a newer version of the same table without first dropping the table the re-adding it. |
@abfisher0417 Thank you. If community agrees, I can complete this PR again. |
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
fix #5163
Add 'force' option to register existing table, this is optional
usage:
CALL mycatalogname.system.register_table('mycatalogname.mydb.mytablename','xx://xxxxx/metadata/xx.metadata.json','force')