-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor commit lock mechanism from HiveTableOperations #5877
Conversation
Change-Id: I32ef3305c9df091ad1ccc994a542ab9b54d89cfa
|
||
try { | ||
if (state.get().equals(LockState.WAITING)) { | ||
// Retry count is the typical "upper bound of retries" for Tasks.run() function. In fact, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: could you please reformat the comment?
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
public class HiveCommitLock { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be useful to implement the standard java.util.concurrent.locks.Lock
interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do, we may also make it clear in the doc that unlike standard Lock, this is not multi thread safe? (ie, it is not safe to call acquire lock from multiple threads) If my observation is right, I guess its worth documenting in any case
@szlta: any particular change to highlight for the review? Also, please check the failing tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its good to refactor it, but definitely need to spend a bit of time to review this critical code. Gave a pass
"Failed to heartbeat for hive lock. %s", | ||
hiveLockHeartbeat.encounteredException.getMessage()); | ||
if (!commitLock.isHeartbeatInProgress()) { | ||
throw new CommitFailedException("Failed to heartbeat for hive lock."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel here we may be losing some information in the stack? in the case that encounteredException is not null?
&& hiveLockHeartbeat.encounteredException == null; | ||
} | ||
|
||
private void acquireLockFromHms() throws UnknownHostException, TException, InterruptedException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: should method be named acquireHmsLock to be consistent with release?
LOG.warn("Failed to unlock {}.{}", databaseName, tableName, e); | ||
} | ||
} | ||
if (hiveLockHeartbeat != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In original code, cancelling seemed to be run before unlock, should we keep it that way?
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
public class HiveCommitLock { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do, we may also make it clear in the doc that unlike standard Lock, this is not multi thread safe? (ie, it is not safe to call acquire lock from multiple threads) If my observation is right, I guess its worth documenting in any case
private final ClientPool<IMetaStoreClient, TException> metaClients; | ||
private final ScheduledExecutorService exitingScheduledExecutorService; | ||
|
||
private Optional<Long> hmsLockId = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering, is there justification to make these Optional versus just null checks?
There are some inteillij warnings because we are using this. Looking into it, there are some who say that Java Optional is not meant for a general purpose Maybe type: https://stackoverflow.com/questions/26327957/should-java-8-getters-return-optional-type/26328555#26328555 hence the intellij warnings. It seems its meant more for return types of methods to indicate that its possible something is null. Of course its more of a debate, but just noticing we dont use a whole lot of Optional elsewhere and may be worth to keep it simple.
I do realize that originally lockId is optional, but mentioning because we are adding another optional (jvmLock). Wdyt?
} | ||
|
||
private void acquireJvmLock() { | ||
if (jvmLock.isPresent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im just wondering, as we check both (jvmLock and lockId) for null, it seems it will protect only the case where single caller calls acquire twice? It wont protect from multi thread calls as these are only optional and not atomic, and method acquire is not synchronized. Would a simple boolean suffice then to avoid two null checks?
"HMS lock ID=%s already acquired for table %s.%s", | ||
hmsLockId.get(), databaseName, tableName)); | ||
} | ||
final LockComponent lockComponent = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do realize its copy and paste, but I know from @aokolnychyi 's comments , we dont tend to use final in inline variable as there's little signfiicance. Could we remove them here?
"Waiting for lock on table %s.%s", databaseName, tableName)); | ||
} | ||
} catch (InterruptedException e) { | ||
Thread.interrupted(); // Clear the interrupt status flag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again I see its copy and paste, but its weird, it seems this is checking a boolean and ignoring the value. Could you check this?
} | ||
|
||
private void releaseHmsLock() { | ||
if (hmsLockId.isPresent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a simple boolean acquired (as mentioned in my other comment), we can just replace these theese two null checks with a single check on release(). As these are both private methods, checking both seems excessive?
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. |
HiveTableOperations
has grown to a substantial size and got pretty complex.A large portion of the logic is the implementation of the locking mechanism which I think could be separated out to reduce this complexity.
Let me know what you think @pvary @szehon-ho