Adds TTL support to SQL server #3277

Merged
merged 3 commits into from
Mar 29, 2023
@@ -33,6 +33,10 @@ spec:
value: <SCHEMA> # Optional. defaults to "dbo"
- name: indexedProperties
value: <INDEXED-PROPERTIES> # Optional. List of IndexedProperties.
- name: metadataTableName # Optional. Name of the table where Dapr stores its metadata
value: "dapr_metadata"
- name: cleanupIntervalInSeconds # Optional. Cleanup interval in seconds, to remove expired rows
value: 300

```

@@ -58,6 +62,8 @@ If you wish to use SQL server as an [actor state store]({{< ref "state_api.md#co
| schema | N | The schema to use. Defaults to `"dbo"` | `"dapr"`,`"dbo"`
| indexedProperties | N | List of IndexedProperties. | `'[{"column": "transactionid", "property": "id", "type": "int"}, {"column": "customerid", "property": "customer", "type": "nvarchar(100)"}]'`
| actorStateStore | N | Indicates that Dapr should configure this component for the actor state store ([more information]({{< ref "state_api.md#configuring-state-store-for-actors" >}})). | `"true"`
| metadataTableName | N | Name of the table Dapr uses to store a few metadata properties. Defaults to `dapr_metadata`. | `"dapr_metadata"`
| cleanupIntervalInSeconds | N | Interval, in seconds, to clean up rows with an expired TTL. Default: `3600` (i.e. 1 hour). Setting this to values <=0 disables the periodic cleanup. | `1800`, `-1`


## Create Azure SQL instance
@@ -80,6 +86,23 @@ When connecting with a dedicated user (not `sa`), these authorizations are requi
- `CREATE TABLE`
- `CREATE TYPE`

### TTLs and cleanups

This state store supports [Time-To-Live (TTL)]({{< ref state-store-ttl.md >}}) for records stored with Dapr. When storing data using Dapr, you can set the `ttlInSeconds` metadata property to indicate after how many seconds the data should be considered "expired".
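
As a minimal sketch, the request body for saving a record with a TTL could be built as follows. The key `order_1`, the value, and the helper name `build_state_item` are made up for illustration; the `ttlInSeconds` metadata value is passed as a string, and the resulting JSON would be sent in a `POST` request to the Dapr sidecar's `/v1.0/state/<store-name>` endpoint.

```python
import json


def build_state_item(key, value, ttl_seconds):
    """Build one item of a state save request with a TTL (hypothetical helper)."""
    return {
        "key": key,
        "value": value,
        # Metadata values are strings in the state API request body.
        "metadata": {"ttlInSeconds": str(ttl_seconds)},
    }


# The state API accepts a JSON array of items; this one expires after 120 seconds.
body = json.dumps([build_state_item("order_1", {"status": "pending"}, 120)])
print(body)
```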

Because SQL Server doesn't have built-in support for TTLs, Dapr implements this by adding a column in the state table indicating when the data should be considered "expired". "Expired" records are not returned to the caller, even if they're still physically stored in the database. A background "garbage collector" periodically scans the state table for expired rows and deletes them.
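
The mechanism described above can be modeled in a few lines. The sketch below is only an illustration of the idea, not Dapr's implementation: it uses an in-memory SQLite database as a stand-in for SQL Server, and the schema, column names, and function names are assumptions. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
import sqlite3


def make_store():
    """Create an in-memory table with a nullable expiry column (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE state (key TEXT PRIMARY KEY, data TEXT NOT NULL, expire_date REAL)"
    )
    return conn


def set_state(conn, key, data, now, ttl_seconds=None):
    """Upsert a row; rows without a TTL keep a NULL expiry and never expire."""
    expire_date = None if ttl_seconds is None else now + ttl_seconds
    conn.execute(
        "INSERT OR REPLACE INTO state (key, data, expire_date) VALUES (?, ?, ?)",
        (key, data, expire_date),
    )


def get_state(conn, key, now):
    """Return the value, hiding rows whose expiry has passed even if still stored."""
    row = conn.execute(
        "SELECT data FROM state WHERE key = ? "
        "AND (expire_date IS NULL OR expire_date > ?)",
        (key, now),
    ).fetchone()
    return row[0] if row else None


def cleanup(conn, now):
    """Garbage-collect expired rows and return how many were deleted."""
    cur = conn.execute(
        "DELETE FROM state WHERE expire_date IS NOT NULL AND expire_date <= ?",
        (now,),
    )
    return cur.rowcount


def row_count(conn):
    """Count the rows physically stored in the table."""
    return conn.execute("SELECT COUNT(*) FROM state").fetchone()[0]


conn = make_store()
set_state(conn, "session", "abc", now=0, ttl_seconds=10)  # expires at t=10
set_state(conn, "config", "xyz", now=0)                   # no TTL

visible_before_expiry = get_state(conn, "session", now=5)
visible_after_expiry = get_state(conn, "session", now=20)  # hidden from the caller
rows_before_cleanup = row_count(conn)                      # still physically stored
deleted = cleanup(conn, now=20)
rows_after_cleanup = row_count(conn)
```

Between expiry and the next cleanup, the expired row is invisible to reads but still occupies space in the table, which is why the cleanup interval matters.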

You can set the interval for the deletion of expired records with the `cleanupIntervalInSeconds` metadata property, which defaults to 3600 seconds (that is, 1 hour).

- Longer intervals require less frequent scans for expired rows, but can require storing expired records for longer, potentially requiring more storage space. If you plan to store many records in your state table, with short TTLs, consider setting `cleanupIntervalInSeconds` to a smaller value - for example, `300` (300 seconds, or 5 minutes).
- If you do not plan to use TTLs with Dapr and the SQL Server state store, you should consider setting `cleanupIntervalInSeconds` to a value <= 0 (e.g. `0` or `-1`) to disable the periodic cleanup and reduce the load on the database.
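
To get a feel for the trade-off, the following rough estimate shows how long an expired record may remain in the table. It assumes that the cleanup runs exactly on schedule and completes instantly, which is a simplification; the function name is hypothetical.

```python
def worst_case_retention_seconds(ttl_seconds, cleanup_interval_seconds):
    """Upper bound on how long a row stays stored, from write to physical deletion.

    Returns None when the periodic cleanup is disabled (interval <= 0), because
    expired rows are then never removed by the background cleanup.
    """
    if cleanup_interval_seconds <= 0:
        return None
    # A row may expire right after a cleanup pass and wait a full interval for the next one.
    return ttl_seconds + cleanup_interval_seconds


with_default_interval = worst_case_retention_seconds(60, 3600)
with_short_interval = worst_case_retention_seconds(60, 300)
with_cleanup_disabled = worst_case_retention_seconds(60, -1)
```

With a 60-second TTL, the default interval can keep a row around for up to 3660 seconds, while a 300-second interval reduces that bound to 360 seconds.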

Contributor

Can you also add a note saying that the column doesn't have an index, and the consequences of that (the garbage collector performs full-table scans)?

Whether an index is required or not depends on the use case. A table with lots of rows (hundreds of thousands) where only some have a TTL could benefit from an index, as a full-table scan in this case would be expensive and would remove only a few rows. A table with not many rows and/or where most rows have a TTL (like tables used for caching) would probably benefit from NOT having an index. An index makes queries faster but requires more memory, more storage, and (slightly) slows down inserts.

Contributor Author

Thanks @ItalyPaleAle!

I disagree with this statement though:

> and/or where most rows have a TTL (like, tables used for caching) would probably benefit from NOT having an index

Surely an index is beneficial because it keeps `ExpireDate` ordered, so the scan can stop early when evaluating the `<` condition?

Contributor

Uhmmm it may really depend on how many rows are in the table, and how many need to be deleted on each iteration. I guess we could benchmark that, but I don't care enough 😅 Point is, users may choose to add an index, or not, depending on their specific patterns.

The state store does not have an index on the `ExpireDate` column, which means that each cleanup operation must perform a full table scan. If you intend to store a large number of records that use TTLs, you should consider creating an index on the `ExpireDate` column. An index makes queries faster, but uses more storage space and slightly slows down writes.

```sql
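-- Nonclustered: a table can have only one clustered index, which the primary key normally already uses
CREATE NONCLUSTERED INDEX expiredate_idx ON state(ExpireDate ASC)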
```
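
To illustrate how an index changes the cleanup query, the sketch below compares query plans before and after creating one. It uses an in-memory SQLite database purely as a stand-in, so the table definition and column names are assumptions and the plan text differs from what SQL Server reports; on SQL Server you would inspect the execution plan of the equivalent `DELETE` instead.

```python
import sqlite3


def cleanup_plan(conn):
    """Return the planner's description of a cleanup-style DELETE (nothing is deleted)."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN DELETE FROM state WHERE expire_date < ?", (0,)
    ).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state (key TEXT PRIMARY KEY, data TEXT, expire_date REAL)")

plan_without_index = cleanup_plan(conn)  # expected to be a scan of the whole table
conn.execute("CREATE INDEX expiredate_idx ON state(expire_date ASC)")
plan_with_index = cleanup_plan(conn)     # expected to search via expiredate_idx

print(plan_without_index)
print(plan_with_index)
```

Whether the index pays off in practice depends on the table size and on how many rows each cleanup removes, so it is worth measuring against your own workload.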

## Related links
- [Basic schema for a Dapr component]({{< ref component-schema >}})
- Read [this guide]({{< ref "howto-get-save-state.md#step-2-save-and-retrieve-a-single-state" >}}) for instructions on configuring state store components