Problem
Today many independent queueables fire concurrently, each ending in its own DML against children of a shared parent (lookup or master-detail). Salesforce takes a lock on the parent row during child writes, so concurrent transactions collide and fail with UNABLE_TO_LOCK_ROW. Lookup and ownership skew amplify the problem. Retry (Issue #N) is a safety net but doesn't address the root cause: N small races for the same parent lock instead of one batched commit.
Proposal
Treat this as a design spike — not a green-lit implementation. Explore whether the Async-lib can intercept DMLs at the end of independent queueables and coalesce them into a single batched commit within a time window (or other grouping signal).
The producer side should require zero changes to existing queueables — devs already wrote MyJob ending in update records;. The framework should be able to opt that job into "deferred DML" mode without rewriting it.
// Sketch — devs don't redesign their jobs, they just opt in
Async.queueable(new MyJob(args))
    .deferDmlVia(AccountDmlCoalescer.class)
    .enqueue();
The coalescer collects pending DML requests from N producer jobs within a configurable window, sorts by parent ID, deduplicates, commits once.
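To make the "sort, deduplicate, commit once" step concrete, here is a minimal in-memory sketch of what a coalescer could do with the records it has collected. Everything named here (CoalesceSketch, coalesce, the last-write-wins rule) is a hypothetical illustration, not an existing Async-lib or DML-lib API, and it assumes the pending records are updates that already carry an Id.

```apex
// Hypothetical sketch: none of these names exist in Async-lib or DML-lib.
public class CoalesceSketch {
    // Merges records queued by several producers into one list that is
    // deduplicated by record Id and ordered so that children of the same
    // parent sit next to each other.
    public static List<SObject> coalesce(
        List<List<SObject>> pendingBatches,
        Schema.SObjectField parentField
    ) {
        // Assumption: when two producers touch the same record, the later
        // one replaces the earlier one (record-level last write wins).
        // Records without an Id (inserts) are out of scope for this sketch.
        Map<Id, SObject> latestById = new Map<Id, SObject>();
        for (List<SObject> batch : pendingBatches) {
            for (SObject record : batch) {
                latestById.put(record.Id, record);
            }
        }

        // Group the surviving records by their parent Id.
        Map<Id, List<SObject>> byParent = new Map<Id, List<SObject>>();
        for (SObject record : latestById.values()) {
            Id parentId = (Id) record.get(parentField);
            if (!byParent.containsKey(parentId)) {
                byParent.put(parentId, new List<SObject>());
            }
            byParent.get(parentId).add(record);
        }

        // Emit the groups in ascending parent Id order.
        List<Id> parentIds = new List<Id>(byParent.keySet());
        parentIds.sort();
        List<SObject> ordered = new List<SObject>();
        for (Id parentId : parentIds) {
            ordered.addAll(byParent.get(parentId));
        }
        return ordered;
    }
}
```

A coalescer could then issue a single update on the returned list. Whether record-level last-write-wins is acceptable, or whether fields from different producers need to be merged, is one of the things the spike would have to decide.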
Mitigations to evaluate FIRST (cheaper, ship before spike)
Before building coalescing infra, prove these aren't enough:
1. Sort by parent ID before DML — kills most lock contention in practice. Could be a .sortByParent(Field) helper in DML-lib.
2. Partial commit + retry failed subset — Database.update(records, false) + retry the locked rows. Pairs naturally with Issue #N.
3. .dedupe() option on individual queueables — drop duplicate work in same job.
If 1+2+3 still leaves measurable contention, the coalescer is justified.
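The partial-commit mitigation can be sketched with the standard Database.update(records, false) call, which returns one Database.SaveResult per input record. The helper below is a rough illustration only: the class and method names are hypothetical, and it assumes lock failures surface as per-row errors with the UNABLE_TO_LOCK_ROW status code rather than as an exception.

```apex
// Hypothetical helper; class and method names are illustrative only.
public class PartialCommitSketch {
    // Updates with allOrNone = false and returns the records that failed
    // specifically with UNABLE_TO_LOCK_ROW, so the caller can retry them.
    public static List<SObject> updateAndCollectLocked(List<SObject> records) {
        List<Database.SaveResult> results = Database.update(records, false);
        List<SObject> locked = new List<SObject>();
        // SaveResult entries come back in the same order as the input list.
        for (Integer i = 0; i < results.size(); i++) {
            if (results[i].isSuccess()) {
                continue;
            }
            for (Database.Error err : results[i].getErrors()) {
                if (err.getStatusCode() == StatusCode.UNABLE_TO_LOCK_ROW) {
                    locked.add(records[i]);
                    break;
                }
            }
        }
        return locked;
    }
}
```

The returned subset is what would be handed to the retry mechanism from Issue #N. Rows that failed for other reasons (validation rules, missing fields) are deliberately not returned, since retrying them would not help.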
Open Questions
- Cross-transaction queue mechanism — Async-lib internal state, Platform Event, custom object queue, Platform Cache? Open to options; PE was one idea, not a requirement.
- Grouping key — per SObjectType, per parent ID, per coalescer class, per business window?
- Time/size window — fixed delay, max batch size, both?
- Single-runner enforcement — how do we guarantee only one coalescer flushes a given group at a time?
- Failure semantics — if coalesced DML partially fails, how do we attribute per-producer logs back?
- Interplay with retry framework — does retry happen at producer level or coalescer level?
Acceptance Criteria (for the spike)