ttl: Remove ttl_range_concurrency config
fixes #89393

see #89392 for benchmarking

To simplify TTL setup, range concurrency is set to min(num_spans, num_cpus)
in each processor instead of letting the user set it.

Release note (sql change): Cluster setting sql.ttl.default_range_concurrency
and table storage param ttl_range_concurrency are no longer configurable.
ecwall committed Oct 19, 2022
1 parent ed1d8f5 commit 83a7b10
Showing 17 changed files with 185 additions and 150 deletions.
10 changes: 5 additions & 5 deletions docs/RFCS/20220120_row_level_ttl.md
@@ -137,6 +137,8 @@ TTL metadata is stored on the TableDescriptor:
```protobuf
message TableDescriptor {
message RowLevelTTL {
option (gogoproto.equal) = true;
// DurationExpr is the automatically assigned interval for when the TTL should apply to a row.
optional string duration_expr = 1 [(gogoproto.nullable)=false, (gogoproto.casttype)="Expression"];
// SelectBatchSize is the amount of rows that should be fetched at a time
@@ -147,8 +149,8 @@ message TableDescriptor {
optional string deletion_cron = 4 [(gogoproto.nullable)=false];
// ScheduleID is the ID of the row-level TTL job schedules.
optional int64 schedule_id = 5 [(gogoproto.customname)="ScheduleID",(gogoproto.nullable)=false];
// RangeConcurrency is the number of ranges to process at a time.
optional int64 range_concurrency = 6 [(gogoproto.nullable)=false];
// RangeConcurrency is based on the number of spans and is no longer configurable.
reserved 6;
// DeleteRateLimit is the maximum amount of rows to delete per second.
optional int64 delete_rate_limit = 7 [(gogoproto.nullable)=false];
// Pause is set if the TTL job should not run.
@@ -180,7 +182,6 @@ the following options to control the TTL job:
| `ttl_expiration_expression` | If set, uses the expression specified as the TTL expiration. Defaults to just using the `crdb_internal_expiration` column. |
| `ttl_select_batch_size` | How many rows to fetch from the range that have expired at a given time. Defaults to 500. Must be at least `1`. |
| `ttl_delete_batch_size` | How many rows to delete at a time. Defaults to 100. Must be at least `1`. |
| `ttl_range_concurrency` | How many concurrent ranges are being worked on at a time. Defaults to `cpu_core_count`. Must be at least `1`. |
| `ttl_delete_rate_limit` | Maximum number of rows to be deleted per second (acts as the rate limit). Defaults to 0 (signifying none). |
| `ttl_row_stats_poll_interval` | Whilst the TTL job is running, counts rows and expired rows on the table to report as prometheus metrics. By default unset, meaning no stats are fetched. |
| `ttl_pause` | Stops the TTL job from executing. |
@@ -291,8 +292,7 @@ are additional knobs a user can use to control how effective the deletion
performs:
* how often the deletion job runs (controls amount of "junk" data left)
* table GC time (when tombstones are removed and space is therefore reclaimed)
* the size of the ranges on the table, which has knock on effects for
`ttl_range_concurrency`.
* the distribution of the ranges on the table

### Admission Control
To ensure the deletion job does not affect foreground traffic, we plan on using
1 change: 0 additions & 1 deletion docs/generated/settings/settings-for-tenants.txt
@@ -282,7 +282,6 @@ sql.trace.stmt.enable_threshold duration 0s enables tracing on all statements; s
sql.trace.txn.enable_threshold duration 0s enables tracing on all transactions; transactions open for longer than this duration will have their trace logged (set to 0 to disable); note that enabling this may have a negative performance impact; this setting is coarser-grained than sql.trace.stmt.enable_threshold because it applies to all statements within a transaction as well as client communication (e.g. retries)
sql.ttl.default_delete_batch_size integer 100 default amount of rows to delete in a single query during a TTL job
sql.ttl.default_delete_rate_limit integer 0 default delete rate limit for all TTL jobs. Use 0 to signify no rate limit.
sql.ttl.default_range_concurrency integer 1 default amount of ranges to process at once during a TTL delete
sql.ttl.default_select_batch_size integer 500 default amount of rows to select in a single query during a TTL job
sql.ttl.job.enabled boolean true whether the TTL job is enabled
sql.txn_fingerprint_id_cache.capacity integer 100 the maximum number of txn fingerprint IDs stored
1 change: 0 additions & 1 deletion docs/generated/settings/settings.html
@@ -218,7 +218,6 @@
<tr><td><code>sql.trace.txn.enable_threshold</code></td><td>duration</td><td><code>0s</code></td><td>enables tracing on all transactions; transactions open for longer than this duration will have their trace logged (set to 0 to disable); note that enabling this may have a negative performance impact; this setting is coarser-grained than sql.trace.stmt.enable_threshold because it applies to all statements within a transaction as well as client communication (e.g. retries)</td></tr>
<tr><td><code>sql.ttl.default_delete_batch_size</code></td><td>integer</td><td><code>100</code></td><td>default amount of rows to delete in a single query during a TTL job</td></tr>
<tr><td><code>sql.ttl.default_delete_rate_limit</code></td><td>integer</td><td><code>0</code></td><td>default delete rate limit for all TTL jobs. Use 0 to signify no rate limit.</td></tr>
<tr><td><code>sql.ttl.default_range_concurrency</code></td><td>integer</td><td><code>1</code></td><td>default amount of ranges to process at once during a TTL delete</td></tr>
<tr><td><code>sql.ttl.default_select_batch_size</code></td><td>integer</td><td><code>500</code></td><td>default amount of rows to select in a single query during a TTL job</td></tr>
<tr><td><code>sql.ttl.job.enabled</code></td><td>boolean</td><td><code>true</code></td><td>whether the TTL job is enabled</td></tr>
<tr><td><code>sql.txn_fingerprint_id_cache.capacity</code></td><td>integer</td><td><code>100</code></td><td>the maximum number of txn fingerprint IDs stored</td></tr>
12 changes: 12 additions & 0 deletions pkg/jobs/jobspb/jobs.proto
@@ -1015,12 +1015,18 @@ message RowLevelTTLDetails {
}

message RowLevelTTLProgress {

// JobRowCount is the number of deleted rows for the entire TTL job.
int64 job_row_count = 1;

// ProcessorProgresses is the progress per DistSQL processor.
repeated RowLevelTTLProcessorProgress processor_progresses = 2 [(gogoproto.nullable)=false];

// UseDistSQL is true if the TTL job distributed the work to DistSQL processors (requires cluster v22.2).
bool use_dist_sql = 3 [(gogoproto.customname) = "UseDistSQL"];

// JobSpanCount is the number of spans for the entire TTL job.
int64 job_span_count = 4;
}

message RowLevelTTLProcessorProgress {
@@ -1037,6 +1043,12 @@ message RowLevelTTLProcessorProgress {

// ProcessorRowCount is the row count of the DistSQL processor.
int64 processor_row_count = 3;

// ProcessorSpanCount is the number of spans of the DistSQL processor.
int64 processor_span_count = 4;

// ProcessorConcurrency is the number of parallel tasks the processor will do at once.
int64 processor_concurrency = 5;
}

message SchemaTelemetryDetails {
3 changes: 3 additions & 0 deletions pkg/settings/registry.go
@@ -145,6 +145,9 @@ var retiredSettings = map[string]struct{}{
"kv.refresh_range.time_bound_iterators.enabled": {},
"sql.defaults.datestyle.enabled": {},
"sql.defaults.intervalstyle.enabled": {},

// removed as of 22.2.1
"sql.ttl.default_range_concurrency": {},
}

// sqlDefaultSettings is the list of "grandfathered" existing sql.defaults
4 changes: 2 additions & 2 deletions pkg/sql/catalog/catpb/catalog.proto
@@ -204,8 +204,8 @@ message RowLevelTTL {
optional string deletion_cron = 4 [(gogoproto.nullable)=false];
// ScheduleID is the ID of the row-level TTL job schedules.
optional int64 schedule_id = 5 [(gogoproto.customname)="ScheduleID",(gogoproto.nullable)=false];
// RangeConcurrency is the number of ranges to process at a time.
optional int64 range_concurrency = 6 [(gogoproto.nullable)=false];
// RangeConcurrency is based on the number of spans and is no longer configurable.
reserved 6;
// DeleteRateLimit is the maximum amount of rows to delete per second.
optional int64 delete_rate_limit = 7 [(gogoproto.nullable)=false];
// Pause is set if the TTL job should not run.
3 changes: 0 additions & 3 deletions pkg/sql/catalog/tabledesc/structured.go
@@ -2622,9 +2622,6 @@ func (desc *wrapper) GetStorageParams(spaceBetweenEqual bool) []string {
if bs := ttl.DeleteBatchSize; bs != 0 {
appendStorageParam(`ttl_delete_batch_size`, fmt.Sprintf(`%d`, bs))
}
if rc := ttl.RangeConcurrency; rc != 0 {
appendStorageParam(`ttl_range_concurrency`, fmt.Sprintf(`%d`, rc))
}
if rl := ttl.DeleteRateLimit; rl != 0 {
appendStorageParam(`ttl_delete_rate_limit`, fmt.Sprintf(`%d`, rl))
}
17 changes: 0 additions & 17 deletions pkg/sql/catalog/tabledesc/ttl.go
@@ -51,11 +51,6 @@ func ValidateRowLevelTTL(ttl *catpb.RowLevelTTL) error {
return err
}
}
if ttl.RangeConcurrency != 0 {
if err := ValidateTTLRangeConcurrency("ttl_range_concurrency", ttl.RangeConcurrency); err != nil {
return err
}
}
if ttl.DeleteRateLimit != 0 {
if err := ValidateTTLRateLimit("ttl_delete_rate_limit", ttl.DeleteRateLimit); err != nil {
return err
@@ -155,18 +150,6 @@ func ValidateTTLBatchSize(key string, val int64) error {
return nil
}

// ValidateTTLRangeConcurrency validates the batch size of a TTL.
func ValidateTTLRangeConcurrency(key string, val int64) error {
if val <= 0 {
return pgerror.Newf(
pgcode.InvalidParameterValue,
`"%s" must be at least 1`,
key,
)
}
return nil
}

// ValidateTTLCronExpr validates the cron expression of TTL.
func ValidateTTLCronExpr(key string, str string) error {
if _, err := cron.ParseStandard(str); err != nil {
5 changes: 2 additions & 3 deletions pkg/sql/execinfrapb/processors_ttl.proto
@@ -59,9 +59,8 @@ message TTLSpec {
// flow.
repeated roachpb.Span spans = 5 [(gogoproto.nullable) = false];

// RangeConcurrency controls how many ranges a single ttlProcessor processes
// in parallel.
optional int64 range_concurrency = 6 [(gogoproto.nullable) = false];
// RangeConcurrency is based on the number of spans and is no longer configurable.
reserved 6;

// SelectBatchSize controls the batch size for SELECTs.
optional int64 select_batch_size = 7 [(gogoproto.nullable) = false];
33 changes: 24 additions & 9 deletions pkg/sql/logictest/testdata/logic_test/row_level_ttl
@@ -40,7 +40,7 @@ CREATE TABLE tbl (id INT PRIMARY KEY, text TEXT) WITH (ttl = 'on')

subtest end

subtest todo_add_subtests
subtest ttl_automatic_column_notice

query T noticetrace
CREATE TABLE tbl_ttl_automatic_column (id INT PRIMARY KEY, text TEXT) WITH (ttl_automatic_column = 'on')
@@ -52,6 +52,24 @@ ALTER TABLE tbl_ttl_automatic_column RESET (ttl_automatic_column)
----
NOTICE: ttl_automatic_column is no longer used. Setting ttl_expire_after automatically creates a TTL column. Resetting ttl_expire_after removes the automatically created column.

subtest end

subtest ttl_range_concurrency_notice

query T noticetrace
CREATE TABLE tbl_ttl_range_concurrency (id INT PRIMARY KEY, text TEXT) WITH (ttl_range_concurrency = 2)
----
NOTICE: ttl_range_concurrency is no longer configurable.

query T noticetrace
ALTER TABLE tbl_ttl_range_concurrency RESET (ttl_range_concurrency)
----
NOTICE: ttl_range_concurrency is no longer configurable.

subtest end

subtest todo_add_subtests

statement error expected DEFAULT expression of crdb_internal_expiration to be current_timestamp\(\):::TIMESTAMPTZ \+ '00:10:00':::INTERVAL
CREATE TABLE tbl (
id INT PRIMARY KEY,
@@ -432,12 +450,12 @@ CREATE TABLE tbl (
id INT PRIMARY KEY,
text TEXT,
FAMILY (id, text)
) WITH (ttl_expire_after = '10 minutes', ttl_select_batch_size = 50, ttl_range_concurrency = 2, ttl_delete_rate_limit = 100, ttl_pause = true, ttl_row_stats_poll_interval = '1 minute', ttl_label_metrics = true)
) WITH (ttl_expire_after = '10 minutes', ttl_select_batch_size = 50, ttl_delete_rate_limit = 100, ttl_pause = true, ttl_row_stats_poll_interval = '1 minute', ttl_label_metrics = true)

query T
SELECT reloptions FROM pg_class WHERE relname = 'tbl'
----
{ttl='on',ttl_expire_after='00:10:00':::INTERVAL,ttl_job_cron='@hourly',ttl_select_batch_size=50,ttl_range_concurrency=2,ttl_delete_rate_limit=100,ttl_pause=true,ttl_row_stats_poll_interval='1m0s',ttl_label_metrics=true}
{ttl='on',ttl_expire_after='00:10:00':::INTERVAL,ttl_job_cron='@hourly',ttl_select_batch_size=50,ttl_delete_rate_limit=100,ttl_pause=true,ttl_row_stats_poll_interval='1m0s',ttl_label_metrics=true}

query T
SELECT create_statement FROM [SHOW CREATE TABLE tbl]
@@ -448,7 +466,7 @@ CREATE TABLE public.tbl (
crdb_internal_expiration TIMESTAMPTZ NOT VISIBLE NOT NULL DEFAULT current_timestamp():::TIMESTAMPTZ + '00:10:00':::INTERVAL ON UPDATE current_timestamp():::TIMESTAMPTZ + '00:10:00':::INTERVAL,
CONSTRAINT tbl_pkey PRIMARY KEY (id ASC),
FAMILY fam_0_id_text_crdb_internal_expiration (id, text, crdb_internal_expiration)
) WITH (ttl = 'on', ttl_expire_after = '00:10:00':::INTERVAL, ttl_job_cron = '@hourly', ttl_select_batch_size = 50, ttl_range_concurrency = 2, ttl_delete_rate_limit = 100, ttl_pause = true, ttl_row_stats_poll_interval = '1m0s', ttl_label_metrics = true)
) WITH (ttl = 'on', ttl_expire_after = '00:10:00':::INTERVAL, ttl_job_cron = '@hourly', ttl_select_batch_size = 50, ttl_delete_rate_limit = 100, ttl_pause = true, ttl_row_stats_poll_interval = '1m0s', ttl_label_metrics = true)

statement ok
ALTER TABLE tbl SET (ttl_delete_batch_size = 100)
@@ -462,25 +480,22 @@ CREATE TABLE public.tbl (
crdb_internal_expiration TIMESTAMPTZ NOT VISIBLE NOT NULL DEFAULT current_timestamp():::TIMESTAMPTZ + '00:10:00':::INTERVAL ON UPDATE current_timestamp():::TIMESTAMPTZ + '00:10:00':::INTERVAL,
CONSTRAINT tbl_pkey PRIMARY KEY (id ASC),
FAMILY fam_0_id_text_crdb_internal_expiration (id, text, crdb_internal_expiration)
) WITH (ttl = 'on', ttl_expire_after = '00:10:00':::INTERVAL, ttl_job_cron = '@hourly', ttl_select_batch_size = 50, ttl_delete_batch_size = 100, ttl_range_concurrency = 2, ttl_delete_rate_limit = 100, ttl_pause = true, ttl_row_stats_poll_interval = '1m0s', ttl_label_metrics = true)
) WITH (ttl = 'on', ttl_expire_after = '00:10:00':::INTERVAL, ttl_job_cron = '@hourly', ttl_select_batch_size = 50, ttl_delete_batch_size = 100, ttl_delete_rate_limit = 100, ttl_pause = true, ttl_row_stats_poll_interval = '1m0s', ttl_label_metrics = true)

statement error "ttl_select_batch_size" must be at least 1
ALTER TABLE tbl SET (ttl_select_batch_size = -1)

statement error "ttl_delete_batch_size" must be at least 1
ALTER TABLE tbl SET (ttl_delete_batch_size = -1)

statement error "ttl_range_concurrency" must be at least 1
ALTER TABLE tbl SET (ttl_range_concurrency = -1)

statement error "ttl_delete_rate_limit" must be at least 1
ALTER TABLE tbl SET (ttl_delete_rate_limit = -1)

statement error "ttl_row_stats_poll_interval" must be at least 1
ALTER TABLE tbl SET (ttl_row_stats_poll_interval = '-1 second')

statement ok
ALTER TABLE tbl RESET (ttl_delete_batch_size, ttl_select_batch_size, ttl_range_concurrency, ttl_delete_rate_limit, ttl_pause, ttl_row_stats_poll_interval)
ALTER TABLE tbl RESET (ttl_delete_batch_size, ttl_select_batch_size, ttl_delete_rate_limit, ttl_pause, ttl_row_stats_poll_interval)

query T
SELECT create_statement FROM [SHOW CREATE TABLE tbl]
17 changes: 5 additions & 12 deletions pkg/sql/storageparam/tablestorageparam/table_storage_param.go
@@ -125,6 +125,8 @@ var ttlAutomaticColumnNotice = pgnotice.Newf("ttl_automatic_column is no longer
"Setting ttl_expire_after automatically creates a TTL column. " +
"Resetting ttl_expire_after removes the automatically created column.")

var ttlRangeConcurrencyNotice = pgnotice.Newf("ttl_range_concurrency is no longer configurable.")

var tableParams = map[string]tableParam{
`fillfactor`: {
onSet: func(po *Setter, semaCtx *tree.SemaContext, evalCtx *eval.Context, key string, datum tree.Datum) error {
@@ -307,23 +309,14 @@ var tableParams = map[string]tableParam{
return nil
},
},
// todo(wall): remove in 23.1
`ttl_range_concurrency`: {
onSet: func(po *Setter, semaCtx *tree.SemaContext, evalCtx *eval.Context, key string, datum tree.Datum) error {
val, err := paramparse.DatumAsInt(evalCtx, key, datum)
if err != nil {
return err
}
if err := tabledesc.ValidateTTLRangeConcurrency(key, val); err != nil {
return err
}
rowLevelTTL := po.getOrCreateRowLevelTTL()
rowLevelTTL.RangeConcurrency = val
evalCtx.ClientNoticeSender.BufferClientNotice(evalCtx.Context, ttlRangeConcurrencyNotice)
return nil
},
onReset: func(po *Setter, evalCtx *eval.Context, key string) error {
if po.hasRowLevelTTL() {
po.UpdatedRowLevelTTL.RangeConcurrency = 0
}
evalCtx.ClientNoticeSender.BufferClientNotice(evalCtx.Context, ttlRangeConcurrencyNotice)
return nil
},
},