[yugabyte#19855] DocDB: Handle blocking shutdown issue on drop table due to active write requests
Summary:
As part of commit yugabyte@4d360c7, we changed `TabletPeer::GetRaftConsensus` to return `IllegalState` on shutdown instead of `NotFound`. This caused a regression in `org.yb.cql.TestIndex.testDropDuringWrite`.
As part of `AssembleDocWriteBatch`, the stuck write query requests the status of a transaction and sees the ABORTED state. It then waits for the returned coordinator safe time, which gives transactions that actually committed a window to apply at this participant. `TransactionParticipant::WaitForSafeTime` eventually calls `Tablet::DoGetSafeTime`, which tries to access `TabletPeer::GetRaftConsensus()`. Because the shutdown request arrives and sets the shutdown flags in the meantime, the tablet peer now returns `IllegalState`, where before the above commit it returned `NotFound`. Previously, this `NotFound` was propagated back to the caller. After the above commit, we instead end up executing `mvcc_.SafeTimeForFollower`, which blocks until the request deadline.
```
Result<HybridTime> Tablet::DoGetSafeTime(...) {
  ...
  if (require_lease == RequireLease::kFallbackToFollower &&
      ht_lease_result.status().IsIllegalState()) {
    return CheckSafeTime(mvcc_.SafeTimeForFollower(min_allowed, deadline), min_allowed);
  }
  ...
}
```
The above snippet was introduced in yugabyte#7729 and is required for correctness.
This diff addresses the regression by returning a retryable error at the participant when it is in the shutdown state. Since the `WriteQuery` would not be processed anyway once consensus has shut down, we can fail the `WaitForSafeTime` request early with a retryable error status.
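For illustration only, the early-fail idea can be modeled with a small standalone sketch. This is not the code in this diff: `ToyParticipant`, `Code`, `SafeTimeResult`, `StartShutdown`, and `AdvanceSafeTime` are hypothetical names, and the real `TransactionParticipant` and `Status` types are considerably more involved. The sketch only assumes that a shutdown flag is checked before entering a wait that would otherwise block until the deadline.
```
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical status codes; stand-ins for the real status type.
enum class Code { kOk, kTimedOut, kShutdownRetryable };

struct SafeTimeResult {
  Code code;
  uint64_t safe_time;  // Meaningful only when code == Code::kOk.
};

// Toy model of a participant. All names here are illustrative assumptions.
class ToyParticipant {
 public:
  // Marks the toy participant as shutting down.
  void StartShutdown() { shutting_down_.store(true); }

  // Advances the toy safe time and wakes any waiter.
  void AdvanceSafeTime(uint64_t t) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (t > safe_time_) safe_time_ = t;
    }
    cond_.notify_all();
  }

  // Waits until safe time reaches min_allowed or the timeout expires.
  SafeTimeResult WaitForSafeTime(uint64_t min_allowed,
                                 std::chrono::milliseconds timeout) {
    // Early fail: if shutdown has started, return a retryable error
    // instead of entering a wait that could last until the deadline.
    if (shutting_down_.load()) {
      return {Code::kShutdownRetryable, 0};
    }
    std::unique_lock<std::mutex> lock(mutex_);
    const bool reached = cond_.wait_for(
        lock, timeout, [&] { return safe_time_ >= min_allowed; });
    if (!reached) {
      return {Code::kTimedOut, 0};
    }
    return {Code::kOk, safe_time_};
  }

 private:
  std::atomic<bool> shutting_down_{false};
  std::mutex mutex_;
  std::condition_variable cond_;
  uint64_t safe_time_ = 0;  // Guarded by mutex_.
};
```
In this toy model, a request that arrives after `StartShutdown()` gets the retryable code immediately, regardless of how long its timeout is, whereas the same request without the up-front check would wait out the full timeout.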
Jira: DB-8799
Test Plan:
Jenkins
./yb_build.sh fastdebug --java-test org.yb.cql.TestIndex#testDropDuringWrite -n 20 --tp 1
Reviewers: sergei, arybochkin, rsami
Reviewed By: arybochkin, rsami
Subscribers: bogdan, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D31617