Conversation

Contributor

@jepett0 jepett0 commented Mar 12, 2024

KIKIMR-18302

Users pay differently for HDD and SSD storage. They create storage pools for their databases, differentiated by the underlying storage kind. Moreover, they can specify the preferred storage kind for each column in a table (see [column groups](https://ydb.tech/docs/en/yql/reference/syntax/create_table#column-family) in the docs for the CREATE TABLE statement).

However, up until this PR they could not see how much storage was used on each of the storage pool kinds. (And we had no per-storage-pool-kind quotas to disable writes to a database that exceeded the limit on one of its storage pools.)

We would like to provide users with an aggregate of the disk space usage of the database so they can order additional disks before the space is physically depleted. This is done by aggregating the [per-channel disk space usage statistics](https://github.com/ydb-platform/ydb/blob/7a673cf01feefbe95bf5e7396d9179a5f283aeba/ydb/core/protos/table_stats.proto#L57) that the SchemeShard receives from the data shards (see [TEvPeriodicTableStats](https://github.com/ydb-platform/ydb/blob/7a673cf01feefbe95bf5e7396d9179a5f283aeba/ydb/core/protos/tx_datashard.proto#L789)). Channels are mapped to the corresponding storage pool kinds via the information that the SchemeShard has about the database (in code, databases are subdomains) and the storage pools it was created with. Aggregation is done on two levels: by table and by database. The per-table aggregate can be seen in the UI in the path description of the table under the Describe -> PathDescription -> TableStats -> StoragePools field. The per-database aggregate can be seen in the UI in the Describe -> PathDescription -> DomainDescription -> DiskSpaceUsage -> StoragePoolsUsage field.
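
To illustrate the mapping described above, here is a minimal, self-contained C++ sketch of folding per-channel stats into per-storage-pool-kind usage. The types and names are simplified and hypothetical (the real structures live in table_stats.proto and the SchemeShard subdomain/table info types); this is not the actual SchemeShard code.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for the per-channel stats reported by a datashard.
struct TChannelUsage {
    uint32_t Channel;   // channel id reported by the datashard
    uint64_t DataSize;  // bytes of data stored in this channel
    uint64_t IndexSize; // bytes of index stored in this channel
};

struct TPoolUsage {
    uint64_t DataSize = 0;
    uint64_t IndexSize = 0;
};

// Aggregate per-channel stats into per-storage-pool-kind usage.
// channelToPoolKind is built from the subdomain description: each channel is bound
// to a storage pool, and each storage pool has a kind (e.g. "ssd", "hdd").
std::unordered_map<std::string, TPoolUsage> AggregateByPoolKind(
        const std::vector<TChannelUsage>& channels,
        const std::unordered_map<uint32_t, std::string>& channelToPoolKind)
{
    std::unordered_map<std::string, TPoolUsage> byKind;
    for (const auto& ch : channels) {
        auto it = channelToPoolKind.find(ch.Channel);
        if (it == channelToPoolKind.end()) {
            continue; // unknown channel: skip it (real code may log or count this)
        }
        auto& usage = byKind[it->second];
        usage.DataSize += ch.DataSize;
        usage.IndexSize += ch.IndexSize;
    }
    return byKind;
}
```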

In addition, we implemented "storage_pools_quotas", which the user can specify in the "DatabaseQuotas" section of the configuration of the database they would like to create. There are 3 parameters in each [storage pool quota](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L98):

  • pool kind,
  • hard quota (if any storage pool exceeds its hard quota, writes to the **whole** database (not just the storage pool that has exceeded the quota!) are restricted),
  • soft quota (if all storage pools use less storage than the corresponding soft quota, then the database opens for writing again).

"storage_pools_quotas" can be used together with the existing "data_size_hard_quota" and "data_size_soft_quota" that do not differentiate between storage pools. Exceedance of any hard quota (either the storage pool one, or the entire "data_size_hard_quota") disables writes to the database. To reenable writes, all disk space usage (either the storage pool one, or the aggregated TotalSize) must be below the corresponding soft quota.

One important thing to note about the storage pool usage statistics is that they are delivered to the SchemeShard with a considerable delay (about 1 minute). This means that the storage pool usage will be checked against the storage pool quotas with a delay, and some data can be written above the hard limit. (The other way around holds too: deleting some data to reopen the database for writes will be noticed by the SchemeShard with a considerable delay (about 420 seconds in my tests with the default compaction policy; I don't know where this number comes from). This is due to the fact that the new data is stored in the LSM tree (I guess) and is written to the appropriate storage pool later, after compaction.)

@jepett0 jepett0 requested review from ijon and snaury March 13, 2024 13:09
@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from a19c3b4 to 7009c75 on March 14, 2024 09:07

Collaborator

@ijon ijon left a comment

More explicit tests on the permutation cases of (database, pool, hard, soft quotas, defined, not defined) would be nice

Comment on lines 520 to 523
```cpp
LOG_DEBUG_S(ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
            "Got periodic table stats at tablet " << TabletID()
            << " from shard " << datashardId
            << " pathId " << pathId
            << " raw table stats:\n" << tableStats.DebugString());
```

Collaborator

Is it purely dev time related output? Should it be removed by now?

Contributor Author

It was helpful during development, and I think there is no such message in the debug logs now. I would like to keep it. It is a debug-level log, so it should not bother others much.

Contributor Author

@jepett0 jepett0 Mar 29, 2024

Changed the level to TRACE and switched to ShortDebugString() so that it takes up less space in the output log.
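
For reference, after that change the line presumably looks roughly like this (a sketch based on the snippet above and this comment, not the exact diff):

```cpp
LOG_TRACE_S(ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
            "Got periodic table stats at tablet " << TabletID()
            << " from shard " << datashardId
            << " pathId " << pathId
            << " raw table stats: " << tableStats.ShortDebugString());
```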

@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from 7009c75 to 958757e on March 25, 2024 07:26

@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from 627815f to 7afab1c on March 25, 2024 13:04

@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from 7afab1c to ee5f289 on March 25, 2024 22:17

Member

@snaury snaury left a comment

Mostly LGTM.

@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from 5cc7e0d to fce55f8 on April 1, 2024 06:53

@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from fce55f8 to 8af4a09 on April 1, 2024 10:49

jepett0 added 4 commits April 3, 2024 08:38
- change log level to trace to not pollute the SchemeShard log
- change comment wording to emphasize the real values seen in practice
@jepett0 jepett0 force-pushed the SchemeShard.per_channel_storage_limits.1 branch from 8af4a09 to 5d15404 on April 3, 2024 08:38

github-actions bot commented Apr 3, 2024

2024-04-03 08:39:52 UTC Pre-commit check for bc745ee has started.
2024-04-03 08:39:55 UTC Build linux-x86_64-release-asan is running...
🟢 2024-04-03 09:18:47 UTC Build successful.
2024-04-03 09:20:37 UTC Tests are running...
🔴 2024-04-03 10:56:31 UTC Some tests failed, follow the links below.

Test history

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|------:|-------:|-------:|-------:|--------:|-------:|
| 14229 | 13932  | 0      | 56     | 220     | 21     |

github-actions bot commented Apr 3, 2024

2024-04-03 08:42:09 UTC Pre-commit check for bc745ee has started.
2024-04-03 08:42:12 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-04-03 09:20:57 UTC Build successful.
2024-04-03 09:22:45 UTC Tests are running...
🔴 2024-04-03 11:10:49 UTC Some tests failed, follow the links below.

Test history

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|------:|-------:|-------:|-------:|--------:|-------:|
| 69153 | 57992  | 0      | 20     | 11117   | 24     |

github-actions bot commented Apr 3, 2024

2024-04-03 08:42:10 UTC Pre-commit check for bc745ee has started.
2024-04-03 08:42:13 UTC Build linux-x86_64-release-clang14 is running...
🟢 2024-04-03 09:15:37 UTC Build successful.

@jepett0 jepett0 merged commit 24926f6 into ydb-platform:main Apr 3, 2024
@shnikd shnikd mentioned this pull request Apr 11, 2024
jepett0 added a commit to jepett0/ydb that referenced this pull request Jun 17, 2024
jepett0 added a commit to jepett0/ydb that referenced this pull request Jun 17, 2024