[Phase 1][colocation] Check failed: Table not found in Raft group #10100
Comments
Hi @iswarezwp, what version are you using? |
I think the problem may be that RaftConsensus::UpdateReplica batch-prepares and then batch-writes the ops in a single Update request:

    Result<RaftConsensus::UpdateReplicaResult> RaftConsensus::UpdateReplica(
        ConsensusRequestPB* request, ConsensusResponsePB* response) {
      ......
      // 2 - Enqueue the prepares
      if (!VERIFY_RESULT(EnqueuePreparesUnlocked(*request, &deduped_req, response))) {
        return UpdateReplicaResult();
      }
      if (deduped_req.committed_op_id.index < prev_committed_op_id.index) {
        deduped_req.committed_op_id = prev_committed_op_id;
      }
      // 3 - Enqueue the writes.
      auto last_from_leader = EnqueueWritesUnlocked(
          deduped_req, WriteEmpty(prev_committed_op_id != deduped_req.committed_op_id));
      ......

In the prepare phase, the add_table op is not yet persisted, so the next ops with a

@ddorian This problem was found in a Yugabyte 2.4.5 environment, but all versions may be affected. |
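The prepare-then-write batching described in the comment above can be illustrated with a toy model. This is only a sketch of one plausible reading of the problem, not YugabyteDB code: `Op`, `OpType`, `TableSet`, `PrepareAllThenApply`, and `PrepareTrackingPendingTables` are all hypothetical names, and the second function is just one way to avoid the failure in this model, not necessarily what the actual fix does.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not YugabyteDB types.
enum class OpType { kAddTable, kWrite };

struct Op {
  OpType type;
  std::string table_id;
};

// Tables currently known to the toy colocated tablet.
using TableSet = std::set<std::string>;

// Batch mode: prepare every op first, then apply every op.
// Returns the index of the first write whose table is unknown at prepare
// time (analogous to the "Table not found" check in the issue title),
// or -1 if the whole batch prepared cleanly and was applied.
int PrepareAllThenApply(const std::vector<Op>& batch, TableSet* tables) {
  for (std::size_t i = 0; i < batch.size(); ++i) {
    const Op& op = batch[i];
    if (op.type == OpType::kWrite && tables->count(op.table_id) == 0) {
      return static_cast<int>(i);  // table added earlier in the batch is not visible yet
    }
  }
  for (const Op& op : batch) {
    if (op.type == OpType::kAddTable) tables->insert(op.table_id);
  }
  return -1;
}

// Variant: during prepare, also treat tables added earlier in the same
// batch as visible. Same return convention as above.
int PrepareTrackingPendingTables(const std::vector<Op>& batch, TableSet* tables) {
  TableSet visible = *tables;  // committed tables plus pending add_table ops
  for (std::size_t i = 0; i < batch.size(); ++i) {
    const Op& op = batch[i];
    if (op.type == OpType::kAddTable) {
      visible.insert(op.table_id);
    } else if (visible.count(op.table_id) == 0) {
      return static_cast<int>(i);
    }
  }
  *tables = visible;  // apply phase: every add_table in the batch takes effect
  return -1;
}
```

With a batch of `{add_table "t", write to "t"}` against an empty tablet, the first function reports a failure at index 1, while the second prepares the whole batch.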
logs may help:
|
@iswarezwp Is this related to #6096? The fix for that landed later, in the 2.6/2.7 versions. If not, can you repro this in your version? Do you have a set of repro steps that you are using? |
@bmatican This problem happens while handling a Raft update request, not during local WAL replay. |
Got something similar when running TestPgRegressGin CREATE INDEX on TABLEGROUP: https://gist.github.com/jaki/d5d2bf15965329a20ac92084afe16c45/raw/f7f496a1c8dd850dcf87f0b1d67803ddb15b8c92/issue10100.TestPgRegressGin.log.
This is centos-release-gcc with unrelated changes off recent master commit b472191. |
@jaki I ran |
Fixed by a5455f3 |
Jira Link: [DB-371](https://yugabyte.atlassian.net/browse/DB-371)
For a colocated database, the tablet leader may accidentally send the following update request to a follower that has lost some data. The follower currently does not handle the `add_table` request, which causes a `Check failed` error. This is the stack: