Enable quoting disk space usage by storage pool kind #2678
Conversation
Force-pushed from a19c3b4 to 7009c75
ijon left a comment
More explicit tests on the permutation cases of (database, pool, hard, soft quotas, defined, not defined) would be nice
LOG_DEBUG_S(ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
    "Got periodic table stats at tablet " << TabletID()
        << " from shard " << datashardId
        << " pathId " << pathId
        << " raw table stats:\n" << tableStats.DebugString());
Is this purely dev-time output? Should it be removed by now?
It was helpful during development, and I think there is no such message in the debug logs now. I would like to leave it. It is a debug-level log, so it should not bother others much.
Changed the level to TRACE and switched to ShortDebugString() to make it use less space in the output log.
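For reference, a sketch of what the statement might look like after that change (same surrounding variables as in the snippet above; LOG_TRACE_S as the trace-level counterpart of LOG_DEBUG_S; not a verbatim quote of the final code):

```cpp
// Trace-level variant with the more compact single-line proto dump.
LOG_TRACE_S(ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
    "Got periodic table stats at tablet " << TabletID()
        << " from shard " << datashardId
        << " pathId " << pathId
        << " raw table stats: " << tableStats.ShortDebugString());
```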
Force-pushed from 7009c75 to 958757e
Force-pushed from 627815f to 7afab1c
Force-pushed from 7afab1c to ee5f289
snaury left a comment
Mostly LGTM.
Force-pushed from 5cc7e0d to fce55f8
Force-pushed from fce55f8 to 8af4a09
- change log level to trace to not pollute the SchemeShard log
- change comment wording to emphasize the real values seen in practice
Force-pushed from 8af4a09 to 5d15404
KIKIMR-18302

Users pay differently for HDD and SSD storage. They create storage pools for their databases, differentiated by the underlying storage kind. Moreover, they can specify the preferred storage kind for each column in a table (see [column groups](https://ydb.tech/docs/en/yql/reference/syntax/create_table#column-family) in the docs for the CREATE TABLE statement).

However, up until this PR they didn't know how much storage was used on each of the storage pool kinds. (And we didn't have per-storage-pool-kind quotas to disable writes to a database that exceeded the limit on one of its storage pools.)

We would like to provide users with an aggregate of the database's disk space usage so they can order additional disks before the space is physically depleted. This is done by aggregating the [per-channel disk space usage statistics](https://github.com/ydb-platform/ydb/blob/7a673cf01feefbe95bf5e7396d9179a5f283aeba/ydb/core/protos/table_stats.proto#L57) that the SchemeShard receives from the data shards (see [TEvPeriodicTableStats](https://github.com/ydb-platform/ydb/blob/7a673cf01feefbe95bf5e7396d9179a5f283aeba/ydb/core/protos/tx_datashard.proto#L789)). Channels are mapped to the corresponding storage pool kinds via the information that the SchemeShard has about the database (in code, databases are subdomains) and the storage pools it was created with. Aggregation is done on two levels: by table and by database. The per-table aggregate can be seen in the UI in the path description of the table under the Describe -> PathDescription -> TableStats -> StoragePools field. The per-database aggregate can be seen in the UI in the Describe -> PathDescription -> DomainDescription -> DiskSpaceUsage -> StoragePoolsUsage field.

In addition, we implemented "storage_pools_quotas" that the user can specify in the "DatabaseQuotas" section of the config of the database they would like to create. There are 3 parameters in each [storage pool quota](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L98):
- pool kind,
- hard quota (if any storage pool exceeds its hard quota, writes to the **whole** database (not just the storage pool that exceeded the quota!) are restricted),
- soft quota (if all storage pools use less storage than their corresponding soft quotas, the database opens for writing again).

"storage_pools_quotas" can be used together with the existing ["data_size_hard_quota"](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L82) and ["data_size_soft_quota"](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/public/api/protos/ydb_cms.proto#L88), which do not differentiate between storage pools. Exceeding __any__ hard quota (either a storage pool one or the overall "data_size_hard_quota") disables writes to the database. To re-enable writes, __all__ disk space usage (both the [per-storage-pool usage](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/core/tx/schemeshard/schemeshard_info_types.h#L1460) and the aggregated [TotalSize](https://github.com/jepett0/ydb/blob/a19c3b4dcc28fb1da6d04ecfb139ffdfe90c72fb/ydb/core/tx/schemeshard/schemeshard_info_types.h#L1452)) must be below the corresponding soft quotas.

One important thing to note about the storage pool usage statistics is that they are delivered to the SchemeShard with a considerable delay (about 1 minute). This means that the usage will be checked against the quotas with a delay, and some data can be written above the hard limit. (The other way around applies too: deleting data to reopen the database for writes will be noticed by the SchemeShard with a considerable delay, about 420 seconds in my tests with the default compaction policy; I don't know where this number comes from.) This is because new data is first stored in the LSM tree (I guess) and is written to the appropriate storage pool later, after compaction.
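To make the aggregation step concrete, here is a minimal self-contained sketch of the idea (all type and function names below are hypothetical illustrations, not the actual SchemeShard code): per-channel sizes reported in the periodic table stats are folded into per-storage-pool-kind totals using the channel-to-pool-kind binding known from the subdomain's storage pool configuration.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified stand-ins for the real protobuf/SchemeShard types.
struct TChannelUsage {
    uint32_t Channel;   // channel id reported in the periodic table stats
    uint64_t DataSize;  // data bytes stored in this channel
    uint64_t IndexSize; // index bytes stored in this channel
};

struct TPoolUsage {
    uint64_t DataSize = 0;
    uint64_t IndexSize = 0;
};

// channelToPoolKind comes from what the SchemeShard knows about the subdomain:
// which storage pool (and hence pool kind, e.g. "ssd" or "hdd") each channel
// is bound to.
std::map<std::string, TPoolUsage> AggregateByPoolKind(
        const std::vector<TChannelUsage>& channelStats,
        const std::map<uint32_t, std::string>& channelToPoolKind)
{
    std::map<std::string, TPoolUsage> byPoolKind;
    for (const auto& ch : channelStats) {
        auto it = channelToPoolKind.find(ch.Channel);
        if (it == channelToPoolKind.end()) {
            continue; // channel not bound to a known storage pool, skip it
        }
        auto& pool = byPoolKind[it->second];
        pool.DataSize += ch.DataSize;
        pool.IndexSize += ch.IndexSize;
    }
    return byPoolKind;
}
```

The same fold can run per table (over that table's shards) and per database (over all tables), which gives the two aggregation levels mentioned above.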
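Similarly, a sketch of the quota semantics described above (again with hypothetical names and simplified "0 means not set" conventions, not the real implementation): exceeding any hard quota disables writes, and writes are re-enabled only once every tracked usage is back below its soft quota.

```cpp
#include <cstdint>
#include <map>
#include <string>

struct TStoragePoolQuota {
    uint64_t HardQuota = 0; // 0 means "not set"
    uint64_t SoftQuota = 0; // 0 means "not set"
};

struct TDatabaseQuotas {
    uint64_t DataSizeHardQuota = 0;                      // pool-agnostic hard quota
    uint64_t DataSizeSoftQuota = 0;                      // pool-agnostic soft quota
    std::map<std::string, TStoragePoolQuota> PoolQuotas; // keyed by pool kind
};

// Returns the new "writes disabled" state given the previous one.
// Writes become disabled if any hard quota is exceeded; once disabled,
// they stay disabled until *all* usages drop below their soft quotas.
bool CheckQuotas(bool writesDisabled,
                 uint64_t totalSize,
                 const std::map<std::string, uint64_t>& usageByPoolKind,
                 const TDatabaseQuotas& quotas)
{
    bool anyHardExceeded =
        quotas.DataSizeHardQuota && totalSize > quotas.DataSizeHardQuota;
    bool allBelowSoft =
        !quotas.DataSizeSoftQuota || totalSize < quotas.DataSizeSoftQuota;

    for (const auto& [kind, quota] : quotas.PoolQuotas) {
        auto it = usageByPoolKind.find(kind);
        uint64_t used = (it != usageByPoolKind.end()) ? it->second : 0;
        anyHardExceeded |= quota.HardQuota && used > quota.HardQuota;
        allBelowSoft &= !quota.SoftQuota || used < quota.SoftQuota;
    }

    if (anyHardExceeded)
        return true;           // exceeding any hard quota disables writes
    if (writesDisabled && allBelowSoft)
        return false;          // re-enable only when everything is below soft
    return writesDisabled;     // otherwise keep the previous state
}
```

Because the usage statistics reach the SchemeShard with the delay described below, this check necessarily lags behind the actual disk state in both directions.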