Commit 24926f6

Enable quoting disk space usage by storage pool kind (#2678)
KIKIMR-18302

Users pay differently for HDD and SSD storage. They create storage pools differentiated by the underlying storage kind for their databases. Moreover, they can specify the preferred storage kind for each column in a table (see [column groups](https://ydb.tech/docs/en/yql/reference/syntax/create_table#column-family) in the docs for the CREATE TABLE statement). However, until this PR they could not see how much storage was used on each storage pool kind. (Nor did we have storage pool kind quotas to disable writes to a database that exceeded the limit on one of its storage pools.)

We would like to provide users with an aggregate of the database's disk space usage so they can order additional disks before the space is physically depleted. This is done by aggregating the [by channel disk space usage statistics](https://github.com/ydb-platform/ydb/blob/7a673cf01feefbe95bf5e7396d9179a5f283aeba/ydb/core/protos/table_stats.proto#L57) that the SchemeShard receives from the data shards (see [TEvPeriodicTableStats](https://github.com/ydb-platform/ydb/blob/7a673cf01feefbe95bf5e7396d9179a5f283aeba/ydb/core/protos/tx_datashard.proto#L789)). Channels are mapped to the corresponding storage pool kinds using the information that the SchemeShard has about the database (in code, databases are subdomains) and the storage pools it was created with.

Aggregation is done on two levels: by table and by database.
- The aggregate by table path can be seen in the UI in the path description of the table, under the Describe -> PathDescription -> TableStats -> StoragePools field.
- The aggregate by database can be seen in the UI under the Describe -> PathDescription -> DomainDescription -> DiskSpaceUsage -> StoragePoolsUsage field.

In addition, we implemented "storage_pools_quotas", which the user can specify in the "DatabaseQuotas" section of the config of the database that they would like to create. There are 3 parameters in each [storage pool quota](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L98):
- pool kind,
- hard quota (if any storage pool exceeds its hard quota, writes to the **whole** database (not just the storage pool that has exceeded the quota!) are restricted),
- soft quota (if all storage pools use less storage than the corresponding soft quota, then the database opens for writing again).

"storage_pools_quotas" can be used together with the existing ["data_size_hard_quota"](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L82) and ["data_size_soft_quota"](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L88), which do not differentiate between storage pools. Exceeding __any__ hard quota (either a storage pool one or the overall "data_size_hard_quota") disables writes to the database. To re-enable writes, __all__ disk space usage (both the [storage pool one](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/core/tx/schemeshard/schemeshard_info_types.h#L1460) and the aggregated [TotalSize](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/core/tx/schemeshard/schemeshard_info_types.h#L1452)) must be below the corresponding soft quota.

One important thing to note about the storage pool usage statistics is that they are delivered to the SchemeShard with a considerable delay (about 1 minute). This means that storage pool usage is checked against the storage pool quotas with a delay, and some data can be written above the hard limit. The reverse also holds: when data is deleted to reopen the database for writes, the SchemeShard notices it with a considerable delay (about 420 seconds in my tests with a default compaction policy; I don't know where this number comes from). This is due to the fact that the new data is stored in the LSM tree (I guess) and is written to the appropriate storage pool later, after compaction.
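
The hard/soft quota rule described above behaves like a small hysteresis state machine: writes are disabled when any hard quota is exceeded and re-enabled only once every usage figure is below its soft quota. The following is a minimal C++ sketch of that rule, not the SchemeShard implementation. `Quota`, `QuotaState`, `ExceedsHard`, `BelowSoft` and `UpdateWritesDisabled` are hypothetical names, a value of 0 is assumed to mean "quota not set", and falling back to the hard quota when no soft quota is set is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical quota pair; 0 is assumed to mean "not set".
struct Quota {
    std::uint64_t hard = 0;
    std::uint64_t soft = 0;
};

// Hypothetical state: one overall quota plus one quota per storage pool kind.
struct QuotaState {
    Quota overall;
    std::map<std::string, Quota> byKind;
    bool writesDisabled = false;
};

// True if a hard quota is set and the usage is strictly above it.
inline bool ExceedsHard(const Quota& q, std::uint64_t used) {
    return q.hard != 0 && used > q.hard;
}

// True if the usage is strictly below the recovery threshold.
// Assumption: with no soft quota, the hard quota serves as the threshold.
inline bool BelowSoft(const Quota& q, std::uint64_t used) {
    // The quotas are assumed to be validated already (soft <= hard).
    assert(q.hard == 0 || q.soft == 0 || q.soft <= q.hard);
    const std::uint64_t threshold = q.soft != 0 ? q.soft : q.hard;
    return threshold == 0 || used < threshold;
}

// Re-evaluates the write lock from the latest (possibly delayed) usage figures.
inline bool UpdateWritesDisabled(QuotaState& state,
                                 std::uint64_t totalUsed,
                                 const std::map<std::string, std::uint64_t>& usedByKind) {
    auto usage = [&usedByKind](const std::string& kind) -> std::uint64_t {
        const auto it = usedByKind.find(kind);
        return it == usedByKind.end() ? 0 : it->second;
    };
    if (!state.writesDisabled) {
        // Any single exceeded hard quota closes the whole database for writes.
        bool exceeded = ExceedsHard(state.overall, totalUsed);
        for (const auto& entry : state.byKind) {
            exceeded = exceeded || ExceedsHard(entry.second, usage(entry.first));
        }
        state.writesDisabled = exceeded;
    } else {
        // Every usage figure must drop below its soft quota to reopen writes.
        bool allBelow = BelowSoft(state.overall, totalUsed);
        for (const auto& entry : state.byKind) {
            allBelow = allBelow && BelowSoft(entry.second, usage(entry.first));
        }
        state.writesDisabled = !allBelow;
    }
    return state.writesDisabled;
}
```

Because the usage figures arrive with a delay, a caller would invoke `UpdateWritesDisabled` each time fresh statistics come in; between calls the database may temporarily sit above a hard quota or remain closed after data has been deleted.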
1 parent d8f9dd4 commit 24926f6

24 files changed: +1382 -125 lines changed

ydb/core/cms/console/console__create_tenant.cpp

Lines changed: 20 additions & 1 deletion
@@ -270,7 +270,26 @@ class TTenantsManager::TTxCreateTenant : public TTransactionBase<TTenantsManager
             auto hardQuota = quotas.data_size_hard_quota();
             auto softQuota = quotas.data_size_soft_quota();
             if (hardQuota && softQuota && hardQuota < softQuota) {
-                return Error(Ydb::StatusIds::BAD_REQUEST, "Data size soft quota cannot be larger than hard quota", ctx);
+                return Error(Ydb::StatusIds::BAD_REQUEST,
+                    TStringBuilder() << "Overall data size soft quota (" << softQuota << ")"
+                        << " of the database " << path
+                        << " must be smaller than the hard quota (" << hardQuota << ")",
+                    ctx
+                );
+            }
+            for (const auto& storageQuota : quotas.storage_quotas()) {
+                const auto unitHardQuota = storageQuota.data_size_hard_quota();
+                const auto unitSoftQuota = storageQuota.data_size_soft_quota();
+                if (unitHardQuota && unitSoftQuota && unitHardQuota < unitSoftQuota) {
+                    return Error(Ydb::StatusIds::BAD_REQUEST,
+                        TStringBuilder() << "Data size soft quota (" << unitSoftQuota << ")"
+                            << " for a " << storageQuota.unit_kind() << " storage unit "
+                            << " of the database " << path
+                            << " must be smaller than the corresponding hard quota (" << unitHardQuota << ")",
+                        ctx
+                    );
+                }
+
             }
             Tenant->DatabaseQuotas.ConstructInPlace(quotas);
         }

ydb/core/cms/console/console_ut_tenants.cpp

Lines changed: 66 additions & 0 deletions
@@ -90,6 +90,20 @@ TTenantTestConfig DefaultConsoleTestConfig()
     return res;
 }

+TString DefaultDatabaseQuotas() {
+    return R"(
+        data_size_hard_quota: 3000
+        storage_quotas {
+            unit_kind: "hdd"
+            data_size_hard_quota: 2000
+        }
+        storage_quotas {
+            unit_kind: "hdd-1"
+            data_size_hard_quota: 1000
+        }
+    )";
+}
+
 void CheckAlterTenantSlots(TTenantTestRuntime &runtime, const TString &path,
                            ui64 generation, Ydb::StatusIds::StatusCode code,
                            TVector<TSlotRequest> add,
@@ -2027,6 +2041,58 @@ Y_UNIT_TEST_SUITE(TConsoleTests) {
         TTenantTestRuntime runtime(DefaultConsoleTestConfig(), {}, true);
         RunTestAlterTenantTooManyStorageResourcesForRunning(runtime);
     }
+
+    void RunTestDatabaseQuotas(TTenantTestRuntime& runtime, const TString& quotas, bool shared = false) {
+        using EType = TCreateTenantRequest::EType;
+
+        CheckCreateTenant(runtime, Ydb::StatusIds::SUCCESS,
+            TCreateTenantRequest(TENANT1_1_NAME, shared ? EType::Shared : EType::Common)
+                .WithPools({{"hdd", 1}, {"hdd-1", 1}})
+                .WithDatabaseQuotas(quotas)
+        );
+
+        RestartTenantPool(runtime);
+
+        CheckTenantStatus(runtime, TENANT1_1_NAME, shared, Ydb::StatusIds::SUCCESS,
+                          Ydb::Cms::GetDatabaseStatusResult::RUNNING,
+                          {{"hdd", 1, 1}, {"hdd-1", 1, 1}}, {});
+    }
+
+    Y_UNIT_TEST(TestDatabaseQuotas) {
+        TTenantTestRuntime runtime(DefaultConsoleTestConfig());
+        RunTestDatabaseQuotas(runtime, DefaultDatabaseQuotas());
+    }
+
+    Y_UNIT_TEST(TestDatabaseQuotasBadOverallQuota) {
+        TTenantTestRuntime runtime(DefaultConsoleTestConfig());
+
+        CheckCreateTenant(runtime, Ydb::StatusIds::BAD_REQUEST,
+            TCreateTenantRequest(TENANT1_1_NAME, TCreateTenantRequest::EType::Common)
+                .WithPools({{"hdd", 1}})
+                .WithDatabaseQuotas(R"(
+                        data_size_hard_quota: 1
+                        data_size_soft_quota: 1000
+                    )"
+                )
+        );
+    }
+
+    Y_UNIT_TEST(TestDatabaseQuotasBadStorageQuota) {
+        TTenantTestRuntime runtime(DefaultConsoleTestConfig());
+
+        CheckCreateTenant(runtime, Ydb::StatusIds::BAD_REQUEST,
+            TCreateTenantRequest(TENANT1_1_NAME, TCreateTenantRequest::EType::Common)
+                .WithPools({{"hdd", 1}})
+                .WithDatabaseQuotas(R"(
+                        storage_quotas {
+                            unit_kind: "hdd"
+                            data_size_hard_quota: 1
+                            data_size_soft_quota: 1000
+                        }
+                    )"
+                )
+        );
+    }
 }

 } // namespace NKikimr

ydb/core/protos/subdomains.proto

Lines changed: 9 additions & 0 deletions
@@ -61,8 +61,17 @@ message TDiskSpaceUsage {
         optional uint64 UsedReserveSize = 4;
     }

+    message TStoragePoolUsage {
+        // in bytes
+        optional string PoolKind = 1;
+        optional uint64 TotalSize = 2;
+        optional uint64 DataSize = 3;
+        optional uint64 IndexSize = 4;
+    }
+
     optional TTables Tables = 1;
     optional TTopics Topics = 2;
+    repeated TStoragePoolUsage StoragePoolsUsage = 3;
 }

 message TDomainState {

ydb/core/protos/table_stats.proto

Lines changed: 11 additions & 0 deletions
@@ -16,6 +16,15 @@ message TChannelStats {
     optional uint64 IndexSize = 3;
 }

+message TStoragePoolsStats {
+    message TPoolUsage {
+        optional string PoolKind = 1;
+        optional uint64 DataSize = 2;
+        optional uint64 IndexSize = 3;
+    }
+    repeated TPoolUsage PoolsUsage = 1;
+}
+
 message TTableStats {
     optional uint64 DataSize = 1; // both inMem and ondisk
     optional uint64 RowCount = 2; // both inMem and ondisk
@@ -55,4 +64,6 @@ message TTableStats {
     optional bool HasLoanedParts = 29;

     repeated TChannelStats Channels = 30;
+
+    optional TStoragePoolsStats StoragePools = 31;
 }
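
The commit message describes how per-channel statistics are folded into per-pool-kind statistics such as `TStoragePoolsStats`. The following C++ sketch illustrates that aggregation under simplifying assumptions; it is not the SchemeShard code. `ChannelStats`, `PoolUsage` and `AggregateByPoolKind` are hypothetical stand-ins for the protobuf messages, the channel-to-kind binding is assumed to be a plain vector indexed by channel number, and treating the total as data size plus index size is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one per-channel statistics record.
struct ChannelStats {
    std::uint32_t channel = 0;
    std::uint64_t dataSize = 0;   // bytes
    std::uint64_t indexSize = 0;  // bytes
};

// Hypothetical stand-in for one per-pool-kind aggregate.
struct PoolUsage {
    std::uint64_t dataSize = 0;
    std::uint64_t indexSize = 0;
    // Assumption of this sketch: the total is data plus index.
    std::uint64_t TotalSize() const { return dataSize + indexSize; }
};

// Sums channel statistics per pool kind.
// channelToKind[i] is the pool kind assumed to be bound to channel i.
inline std::map<std::string, PoolUsage> AggregateByPoolKind(
    const std::vector<ChannelStats>& channels,
    const std::vector<std::string>& channelToKind)
{
    std::map<std::string, PoolUsage> result;
    for (const auto& stats : channels) {
        if (stats.channel >= channelToKind.size()) {
            continue;  // no known binding for this channel: skip it
        }
        PoolUsage& usage = result[channelToKind[stats.channel]];
        usage.dataSize += stats.dataSize;
        usage.indexSize += stats.indexSize;
        // Debug-build guard against unsigned wrap-around.
        assert(usage.dataSize >= stats.dataSize);
        assert(usage.indexSize >= stats.indexSize);
    }
    return result;
}
```

Running the same function over the statistics of all partitions of a table would give the per-table figure, and summing the per-table results would give the per-database one.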

ydb/core/testlib/tenant_helpers.h

Lines changed: 16 additions & 1 deletion
@@ -96,6 +96,7 @@ struct TCreateTenantRequest {
     TString Path;
     EType Type;
     TAttrsCont Attrs;
+    Ydb::Cms::DatabaseQuotas DatabaseQuotas;
     // Common & Shared
     TPoolsCont Pools;
     TSlotsCont Slots;
@@ -116,6 +117,18 @@ struct TCreateTenantRequest {
         return *this;
     }

+    TSelf& WithDatabaseQuotas(const Ydb::Cms::DatabaseQuotas& quotas) {
+        DatabaseQuotas = quotas;
+        return *this;
+    }
+
+    TSelf& WithDatabaseQuotas(const TString& quotas) {
+        Ydb::Cms::DatabaseQuotas parsedQuotas;
+        UNIT_ASSERT_C(NProtoBuf::TextFormat::ParseFromString(quotas, &parsedQuotas), quotas);
+        DatabaseQuotas = std::move(parsedQuotas);
+        return *this;
+    }
+
     TSelf& WithPools(const TPoolsCont& pools) {
         if (Type == EType::Unspecified) {
             Type = EType::Common;
@@ -340,14 +353,16 @@ inline void CheckCreateTenant(TTenantTestRuntime &runtime,
     if (request.PlanResolution) {
         event->Record.MutableRequest()->mutable_options()->set_plan_resolution(request.PlanResolution);
     }
+
+    event->Record.MutableRequest()->mutable_database_quotas()->CopyFrom(request.DatabaseQuotas);

     TAutoPtr<IEventHandle> handle;
     runtime.SendToConsole(event);
     auto reply = runtime.GrabEdgeEventRethrow<NConsole::TEvConsole::TEvCreateTenantResponse>(handle);
     auto &operation = reply->Record.GetResponse().operation();

     if (operation.ready()) {
-        UNIT_ASSERT_VALUES_EQUAL(operation.status(), code);
+        UNIT_ASSERT_VALUES_EQUAL_C(operation.status(), code, operation.DebugString());
     } else {
         TString id = operation.id();
         auto *request = new NConsole::TEvConsole::TEvNotifyOperationCompletionRequest;

ydb/core/tx/datashard/ut_common/datashard_ut_common.cpp

Lines changed: 6 additions & 1 deletion
@@ -1831,7 +1831,12 @@ void ExecSQL(Tests::TServer::TPtr server,
     auto request = MakeSQLRequest(sql, dml);
     runtime.Send(new IEventHandle(NKqp::MakeKqpProxyID(runtime.GetNodeId()), sender, request.Release(), 0, 0, nullptr));
     auto ev = runtime.GrabEdgeEventRethrow<NKqp::TEvKqp::TEvQueryResponse>(sender);
-    UNIT_ASSERT_VALUES_EQUAL(ev->Get()->Record.GetRef().GetYdbStatus(), code);
+    auto& response = ev->Get()->Record.GetRef();
+    auto& issues = response.GetResponse().GetQueryIssues();
+    UNIT_ASSERT_VALUES_EQUAL_C(response.GetYdbStatus(),
+                               code,
+                               issues.empty() ? response.DebugString() : issues.Get(0).DebugString()
+    );
 }

 std::unique_ptr<NEvents::TDataEvents::TEvWrite> MakeWriteRequest(ui64 txId, NKikimrDataEvents::TEvWrite::ETxMode txMode, NKikimrDataEvents::TEvWrite_TOperation::EOperationType operationType, const TTableId& tableId, const TVector<TShardedTableOptions::TColumn>& columns, ui32 rowCount, ui64 seed) {

ydb/core/tx/schemeshard/schemeshard__init.cpp

Lines changed: 16 additions & 0 deletions
@@ -2259,6 +2259,22 @@ struct TSchemeShard::TTxInit : public TTransactionBase<TSchemeShard> {
                     stats.RowCount = rowSet.GetValue<Schema::TablePartitionStats::RowCount>();
                     stats.DataSize = rowSet.GetValue<Schema::TablePartitionStats::DataSize>();
                     stats.IndexSize = rowSet.GetValue<Schema::TablePartitionStats::IndexSize>();
+                    if (rowSet.HaveValue<Schema::TablePartitionStats::StoragePoolsStats>()) {
+                        NKikimrTableStats::TStoragePoolsStats protobufRepresentation;
+                        Y_ABORT_UNLESS(ParseFromStringNoSizeLimit(
+                                protobufRepresentation,
+                                rowSet.GetValue<Schema::TablePartitionStats::StoragePoolsStats>()
+                            )
+                        );
+                        for (const auto& poolUsage : protobufRepresentation.GetPoolsUsage()) {
+                            stats.StoragePoolsStats.emplace(
+                                poolUsage.GetPoolKind(),
+                                TPartitionStats::TStoragePoolStats{poolUsage.GetDataSize(),
+                                                                   poolUsage.GetIndexSize()
+                                }
+                            );
+                        }
+                    }

                     stats.LastAccessTime = TInstant::FromValue(rowSet.GetValue<Schema::TablePartitionStats::LastAccessTime>());
                     stats.LastUpdateTime = TInstant::FromValue(rowSet.GetValue<Schema::TablePartitionStats::LastUpdateTime>());

ydb/core/tx/schemeshard/schemeshard__operation_alter_extsubdomain.cpp

Lines changed: 10 additions & 0 deletions
@@ -258,6 +258,16 @@ VerifyParams(TParamsDelta* delta, const TPathId pathId, const TSubDomainInfo::TP
         }
     }

+    // storage pools quotas check
+    TString error;
+    if (const auto& effectivePools = requestedPools.empty()
+            ? actualPools
+            : requestedPools;
+        !CheckStorageQuotasKinds(input.GetDatabaseQuotas(), effectivePools, pathId.ToString(), error)
+    ) {
+        return paramError(error);
+    }
+
     std::set_difference(requestedPools.begin(), requestedPools.end(),
                         actualPools.begin(), actualPools.end(),
                         std::back_inserter(storagePoolsAdded));

ydb/core/tx/schemeshard/schemeshard__operation_alter_subdomain.cpp

Lines changed: 8 additions & 0 deletions
@@ -287,6 +287,14 @@ class TAlterSubDomain: public TSubOperation {
         }

         if (settings.HasDatabaseQuotas()) {
+            if (const auto& effectivePools = requestedPools.empty()
+                    ? actualPools
+                    : requestedPools;
+                !CheckStorageQuotasKinds(settings.GetDatabaseQuotas(), effectivePools, path.PathString(), errStr)
+            ) {
+                result->SetError(NKikimrScheme::StatusInvalidParameter, errStr);
+                return result;
+            }
             alterData->SetDatabaseQuotas(settings.GetDatabaseQuotas());
         }

ydb/core/tx/schemeshard/schemeshard__operation_common_subdomain.h

Lines changed: 31 additions & 0 deletions
@@ -6,6 +6,37 @@
 namespace NKikimr {
 namespace NSchemeShard {

+inline bool CheckStorageQuotasKinds(const Ydb::Cms::DatabaseQuotas& quotas,
+                                    const TVector<TStoragePool>& pools,
+                                    const TString& path,
+                                    TString& error
+) {
+    TVector<TString> quotedKinds;
+    for (const auto& storageQuota : quotas.storage_quotas()) {
+        quotedKinds.emplace_back(storageQuota.unit_kind());
+    }
+    Sort(quotedKinds);
+    const auto uniqueEnd = Unique(quotedKinds.begin(), quotedKinds.end());
+    if (uniqueEnd != quotedKinds.end()) {
+        error = TStringBuilder()
+            << "Malformed subdomain request: storage quotas' unit kinds must be unique, but "
+            << *uniqueEnd << " appears twice in the storage quotas definition of the " << path << " subdomain.";
+        return false;
+    }
+
+    for (const auto& quotedKind : quotedKinds) {
+        if (!AnyOf(pools, [&quotedKind](const TStoragePool& pool) {
+            return pool.GetKind() == quotedKind;
+        })) {
+            error = TStringBuilder()
+                << "Malformed subdomain request: cannot set a " << quotedKind << " storage quota, "
+                << "because no storage pool in the subdomain " << path << " has the specified kind.";
+            return false;
+        }
+    }
+    return true;
+}
+
 namespace NSubDomainState {

 class TConfigureParts: public TSubOperationState {
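
A note on the uniqueness check in `CheckStorageQuotasKinds`: assuming `Unique` behaves like `std::unique`, the elements from the returned iterator onward are left in a valid but unspecified state, so `*uniqueEnd` may not name the kind that was actually duplicated. The standalone C++ sketch below shows one alternative that reports the duplicate reliably by using `std::adjacent_find` on the sorted kinds. `FindDuplicateKind` is a hypothetical helper written with standard library types for illustration; it is not part of the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns true and stores the repeated kind in
// `duplicate` if any kind occurs more than once; otherwise returns false
// and leaves `duplicate` untouched.
inline bool FindDuplicateKind(std::vector<std::string> kinds, std::string& duplicate) {
    // Sorting places equal kinds next to each other.
    std::sort(kinds.begin(), kinds.end());
    // adjacent_find points at the first element of the first equal pair
    // and does not modify the range.
    const auto it = std::adjacent_find(kinds.begin(), kinds.end());
    if (it == kinds.end()) {
        return false;
    }
    assert(*it == *(it + 1));
    duplicate = *it;
    return true;
}
```

The vector is taken by value so that sorting does not reorder the caller's data.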
