
Outbox cleanup in scaled out environments requires advanced configuration. Cleanup should be adaptive and elastic scaleout friendly by default #987

Open
ramonsmits opened this issue Aug 8, 2022 · 1 comment

ramonsmits commented Aug 8, 2022

By default, instances compete for the outbox cleanup task. If, for instance, the endpoint is scaled out to 5 instances, the cleanup task will run 5 times per minute.

Cleanup active/passive via leader election

It would be nice if, via some form of leader election, the cleanup ran on only a single node, and/or if cleanup were adaptive/dynamic, meaning that on very low volume endpoints it would run less frequently.

For example, have a table that contains a lease for that endpoint's outbox cleanup. The lease could be, say, 10 minutes, and the lease owner would extend it every 5 minutes. Other instances should try to take over the lease at the end of its term, but will fail if the active instance has already renewed it. If the active instance shuts down gracefully it can DELETE the lease record; if it dies ungracefully, one of the passive instances will obtain the lease once the expiry makes its take-over query succeed.

Install native cleanup job

An alternative would be to schedule a native cleanup job during installation, so that an endpoint instance can detect whether this job runs frequently.

bbrandt commented Jun 25, 2024

This explains a lot. I ran a load test experiment today, turning on Outbox for the first time, using between 20 and 40 nodes for 3 different services (Azure Service Bus transport and SQL persistence). I noticed Outbox seemed to add about 5 seconds per handler and hadn't had time to dig into why (or call sp_BlitzWho and all that). If the cleanup is running 20-40 times every 5 minutes and the DispatchedAt column is not indexed (#1343), that could be a contributor to the poor throughput I saw.
