Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using partitioning for OutboxRecord cleanup #319

Open
ramonsmits opened this issue Jan 3, 2018 · 1 comment
Open

Consider using partitioning for OutboxRecord cleanup #319

ramonsmits opened this issue Jan 3, 2018 · 1 comment

Comments

@ramonsmits
Copy link
Member

Source: https://groups.google.com/forum/#!topic/particularsoftware/CaZMSSti6BE

TL;DR

Some background

Our OutboxRecord table is stored in SQL Azure. We have a high volume of traffic and we are seeing a large amount of database contention with selects, updates, and deletes. The default installation of OutboxRecord is a table backed by a single database partition. We're running a separate automation job to handle cleanup of the OutboxRecord table because the default delete logic is nearly always terminated by SQL Azure to avoid deadlocks of the other threads. We have to delete from the table in smaller batches to avoid SQL terminating the process.

A possible solution for this is to partition the OutboxRecord table on the DispatchedAt field. We want to retain dispatched records for a period of time after which we can delete them. This allows us to have a number of partitions, based on DispatchedAt, which allows us to drop entire partitions once we no longer need that data. This avoids locks, but does change the design of OutboxRecord. I have a working (in test) solution, but I had to modify the OutboxRecord table's indexes to make this work.

Changes I had to make

OutboxRecord has a clustered index on the ID field, but this field doesn't appear to be used by any queries. MessageId is used extensively, so I dropped all indexes, created my partition function, and then created a new clustered index on MessageId and DispatchedAt. I also created unclustered indexes on Dispatched and DispatchedAt. I did not create an index for the ID field as it didn't seem to be used.

And the question

Are there any issues we may encounter with not having an index on ID (not to be confused with MessageId which is not part of a clustered index whereas before it was an unclustered index)? I didn't see where ID was used, so I think I'm safe in removing the clustered index on that field, but wanted to check.

@ramonsmits
Copy link
Member Author

It doesn't seem that this is 'easy' to do. The CREATE PARTITION FUNCTION only seems an range of values set that must be specified upfront.

One thing that we can do is add guidance on how to run the cleanup frequently on the database itself as a task instead of having it run from the instance itself which is potentially scaled out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant