Commit ac2af53

[BACKPORT 2.18][#20178, #20041] Docdb: Handle deleted index better during index backfill
Summary: Original commit: 0c7dd15 / D30865

Our stress tests commonly delete an index before it has been backfilled. The error message logged in that case is confusing and not very helpful. This change simplifies the returned error code and message.

Previously:
```
E1117 04:24:12.662847 80635 backfill_index.cc:986] Backfill Index Table(s) { test_indexes_034c74byvaluev4_idx4 } failed to backfill the index: [e718ede41a574e94bb855c1e0a321325] due to Invalid argument (yb/tserver/tablet_service.cc:735): Tablet has a different schema 595 vs 591. Requested index is not ready to backfill. IndexMap: 0x000035913cdb1e58 -> [{955035f2fa10421c9d9d379303bd95c7, table_id: "955035f2fa10421c9d9d379303bd95c7" version: 0 is_local: false columns { column_id: 0 indexed_column_id: 2 column_name: "C$_v2" colexpr { column_id: 2 } } . . . . columns { column_id: 1 indexed_column_id: 0 column_name: "C$_k" colexpr { column_id: 0 } } hash_column_count: 1 range_column_count: 1 is_unique: false indexed_table_id: "68214fdb2bd2484095f782959aca7482" indexed_hash_column_ids: 0 use_mangled_column_name: true index_permissions: INDEX_PERM_READ_WRITE_AND_DELETE backfill_error_message: "" num_rows_processed_by_backfill_job: 1461853}] }}
```

After the change:
```
[m-1] W1213 17:37:46.736081 1898082304 backfill_index.cc:1582] TS 0127865cf9154dd8aace4611aaefe502: backfill failed for tablet 6b2b0b9c6bd942feb971ed47ddbf310e (table test_table [id=9eef1eda4e9d4d44bab7b2142a4abff9]) no further retry: Invalid argument (yb/tserver/tablet_service.cc:716): Index 95f1a0ca84eb4bd9b922f9edc41d5525 not found in index_map. Current schema is 19 response was error { code: OPERATION_NOT_SUPPORTED status { code: INVALID_ARGUMENT message: "Index 95f1a0ca84eb4bd9b922f9edc41d5525 not found in index_map. Current schema is 19" source_file: "../../src/yb/tserver/tablet_service.cc" source_line: 716 errors: "\000" } } failed_index_ids: "95f1a0ca84eb4bd9b922f9edc41d5525"
[m-1] I1213 17:37:46.736519 1898082304 backfill_index.cc:1331] Failed to backfill the tablet 0x00000001248da800 -> 6b2b0b9c6bd942feb971ed47ddbf310e (table test_table [id=9eef1eda4e9d4d44bab7b2142a4abff9]): Invalid argument (yb/tserver/tablet_service.cc:716): Index 95f1a0ca84eb4bd9b922f9edc41d5525 not found in index_map. Current schema is 19
```

Additionally, prior to this revision we were not populating `failed_indexes` with the id of the missing index, which caused the whole batch of indexes to be marked as "failed". We now ensure that `failed_indexes` is populated correctly, so that only the index which is not found in the IndexMap is marked as failed and the remaining indexes backfill successfully.

Jira: DB-9124, DB-9003

Test Plan: ybd --cxx-test cassandra_cpp_driver-test --gtest_filter CppCassandraDriverTest.DeleteIndexWhileBackfilling

Reviewers: rthallam, jason, arybochkin

Reviewed By: jason

Subscribers: ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D31591
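The batch behaviour described in the summary can be sketched in isolation. The following is a simplified model, not YugabyteDB code: `IndexOutcome` and `ResolveBatch` are hypothetical names, and the rule "an error response with no per-index detail fails every index in the batch" is taken from the summary's description rather than from the master's actual implementation.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical per-index outcome; not a YugabyteDB type.
enum class IndexOutcome { kBackfilled, kFailed };

// Simplified model of how an error response from a tablet maps onto a batch
// of indexes that were backfilling together. Assumes the response is an error.
std::map<std::string, IndexOutcome> ResolveBatch(
    const std::vector<std::string>& batch,
    const std::vector<std::string>& failed_index_ids) {
  std::map<std::string, IndexOutcome> outcome;
  // Without per-index detail, the whole batch has to be treated as failed.
  const IndexOutcome fallback = failed_index_ids.empty()
                                    ? IndexOutcome::kFailed
                                    : IndexOutcome::kBackfilled;
  for (const auto& id : batch) {
    outcome[id] = fallback;
  }
  // With per-index detail, only the listed indexes are marked as failed.
  for (const auto& id : failed_index_ids) {
    auto it = outcome.find(id);
    if (it != outcome.end()) {
      it->second = IndexOutcome::kFailed;
    }
  }
  return outcome;
}
```

Calling `ResolveBatch` with an empty failed list models the old behaviour (every index fails), while passing the id of the dropped index models the new behaviour (only that index fails).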
1 parent 5ffafda commit ac2af53

2 files changed: +14 -5 lines changed
src/yb/integration-tests/cassandra_cpp_driver-test.cc

Lines changed: 2 additions & 0 deletions
@@ -2007,6 +2007,7 @@ TEST_F_EX(
   ASSERT_OK(table.CreateTable(&session_, "test.test_table", {"k", "v"}, {"(k)"}, true));

   LOG(INFO) << "Creating two indexes that will backfill together";
+  ASSERT_OK(cluster_->SetFlagOnMasters("TEST_block_do_backfill", "true"));
   // Create 2 indexes that backfill together. One of them will be deleted while the backfill
   // is happening. The deleted index should be successfully deleted, and the other index will
   // be successfully backfilled.
@@ -2029,6 +2030,7 @@ TEST_F_EX(
   ASSERT_OK(session_.ExecuteQuery("drop index test_table_index_by_v1"));

   // Wait for the backfill to actually run to completion/failure.
+  ASSERT_OK(cluster_->SetFlagOnMasters("TEST_block_do_backfill", "false"));
   SleepFor(MonoDelta::FromSeconds(10));
   res = client_->WaitUntilIndexPermissionsAtLeast(
       table_name, index_table_name1, IndexPermissions::INDEX_PERM_NOT_USED, 50ms /* max_wait */);

src/yb/tserver/tablet_service.cc

Lines changed: 12 additions & 5 deletions
@@ -638,6 +638,8 @@ void TabletServiceAdminImpl::BackfillIndex(
     return;
   }

+  const uint32_t our_schema_version = tablet.peer->tablet_metadata()->schema_version();
+  const uint32_t their_schema_version = req->schema_version();
   bool all_at_backfill = true;
   bool all_past_backfill = true;
   bool is_pg_table = tablet.tablet->table_type() == TableType::PGSQL_TABLE_TYPE;
@@ -667,9 +669,16 @@ void TabletServiceAdminImpl::BackfillIndex(
       all_past_backfill &=
           idx_info_pb.index_permissions() > IndexPermissions::INDEX_PERM_DO_BACKFILL;
     } else {
-      LOG(WARNING) << "index " << idx.table_id() << " not found in tablet metadata";
-      all_at_backfill = false;
-      all_past_backfill = false;
+      const auto& index_table_id = idx.table_id();
+      LOG(INFO) << "index " << index_table_id << " not found in tablet metadata";
+      *resp->add_failed_index_ids() = index_table_id;
+      SetupErrorAndRespond(
+          resp->mutable_error(),
+          STATUS_SUBSTITUTE(
+              InvalidArgument, "Index $0 not found in index_map. Current schema is $1",
+              index_table_id, our_schema_version),
+          TabletServerErrorPB::OPERATION_NOT_SUPPORTED, &context);
+      return;
     }
   }

@@ -686,8 +695,6 @@ void TabletServiceAdminImpl::BackfillIndex(
     return;
   }

-  uint32_t our_schema_version = tablet.peer->tablet_metadata()->schema_version();
-  uint32_t their_schema_version = req->schema_version();
   DCHECK_NE(our_schema_version, their_schema_version);
   SetupErrorAndRespond(
       resp->mutable_error(),
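
The early return added in the second hunk can be illustrated outside the tablet server. This is a minimal sketch under stated assumptions: `BackfillCheck` and `CheckRequestedIndexes` are hypothetical names, a `std::set` stands in for the tablet's index map, and a plain string stands in for the `Status` built by `STATUS_SUBSTITUTE`. Only the message format and the "record the id, then stop at the first missing index" flow are taken from the diff.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the relevant fields of the backfill response.
struct BackfillCheck {
  bool ok = true;
  std::vector<std::string> failed_index_ids;
  std::string error_message;
};

// Walks the requested indexes and stops at the first one that is missing
// from the index map, recording its id alongside the error message.
BackfillCheck CheckRequestedIndexes(const std::set<std::string>& index_map,
                                    const std::vector<std::string>& requested,
                                    uint32_t our_schema_version) {
  BackfillCheck result;
  for (const auto& index_table_id : requested) {
    if (index_map.count(index_table_id) == 0) {
      result.ok = false;
      result.failed_index_ids.push_back(index_table_id);
      // Same wording as the format string in the diff.
      result.error_message = "Index " + index_table_id +
                             " not found in index_map. Current schema is " +
                             std::to_string(our_schema_version);
      return result;  // Respond immediately, as the new else-branch does.
    }
  }
  return result;
}
```

Because the function returns on the first miss, a request that names several missing indexes reports only the first one per call in this sketch.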
