-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API, Core: Multi-Table transactions #6757
Conversation
230f411
to
118bbaf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome to see @nastra I'll hold off on reviewing the code itself until it's out of draft. I did have more fundamental questions on behavior, especially with multi-table transactions in the presence of branches, and how the isolation level checks are performed.
enum IsolationLevel { | ||
|
||
/** | ||
* Table can change underneath but still have to pass normal table validation. Loading table | ||
* data when it's referenced. | ||
*/ | ||
SNAPSHOT, | ||
|
||
/** | ||
* All tables participating in the transaction must be in the same state when committing | ||
* compared to when the transaction started. Loading table data when it is referenced. | ||
*/ | ||
SERIALIZABLE_WITH_LOADING_READS, | ||
|
||
/** | ||
* All tables participating in the transaction must be in the same state when committing | ||
* compared to when the transaction started. Loading table data as of TX start time. | ||
*/ | ||
SERIALIZABLE_WITH_FIXED_READS, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is still draft, but I wanted to verify for multi-table transactions shouldn't the isolation level enforcement be done at the branch level and not just the table level?
For example, for serializable with fixed reads I would expect the validation that the branch is in the same state even if another branch is written to.
CatalogTransaction catalogTx = catalog.startTransaction(isolationLevel);
Catalog txCatalog = catalogTx.asCatalog();
catalog.loadTable(table1).newFastAppend().appendFile(FILE_B).toBranch("branch1").commit();
catalog.loadTable(table2).newFastAppend().appendFile(FILE_C).toBranch("branch2").commit();
Let's say the append to table 1 above happens concurrently with an UPDATE/DELETE which matches but the update/delete happens on "branch3", then the multi-table transaction should still succeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should actually be most likely at the branch level (but I'd have to revisit and think about this). However, I started to work on this before writing to branches (#5234) was committed, hence it's not being reflected here.
TableMetadata tableMetadata = | ||
((BaseTable) origin().loadTable(affectedTable)).operations().current(); | ||
if (tableMetadata.lastUpdatedMillis() >= txStartTime().toEpochMilli()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This goes back to my point above, I think checking tableMetadata last updated is too broad when performing the isolation level checks, I think it needs to be performed at the branch level. Unless we're establishing that multi-table transactions can only happen on each table's main branch but that feels too restrictive. Also not sure if we can just do a timestamp check (clock skew between the last updated time set and the start time of the transaction).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, I started working on this before writing to branches was possible (#5234), hence I didn't consider this in this initial version, simply because I missed that this feature made it into master
@nastra, can you write up a design doc? |
api/src/main/java/org/apache/iceberg/catalog/CatalogTransaction.java
Outdated
Show resolved
Hide resolved
88156cf
to
8b6e944
Compare
8b6e944
to
371ae2f
Compare
Closing this in favor of #6948 |
No description provided.