hir-94: actually reschedule queue entries when an account is over quota#18
hir-94: actually reschedule queue entries when an account is over quota#18jaredzwick wants to merge 1 commit into
Conversation
The quota-exceeded branch in `processEmailQueue` claimed to "reschedule for quota reset time" but the underlying `updateQueueStatus` call only touched `status` and `errorMessage`. `scheduledFor` stayed at its original value. The next batch picked the same entry again — because `getNextPendingEmails` filters by `scheduledFor <= now` — wasted a slot re-running the unsubscribe + quota checks, and skipped it again. Under sustained load with one over-quota account, the processor would fill its entire batch every iteration with the same N entries and never make forward progress on other accounts' work. This change: - libs/db/src/queries/emailQueue.ts: new `rescheduleQueueEntry(id, newScheduledFor, errorMessage?)` helper. Keeps status pending, bumps `scheduledFor`, sets `updatedAt`. Does NOT increment `attemptCount` because no send was attempted. - src/lib/queueScheduling.ts (new): pure `computeQuotaRescheduleAt( quotaResetAt, now?, minDelayMs?)` — picks the future moment to retry. Honors the account's stored quota reset when it's far enough out; floors at `now + minDelay` (default 1 minute) when the reset is null, in the past, equal to now, or sooner than the floor. Always returns a Date >= now so we never schedule into the past. - src/lib/emailQueueProcessor.ts: quota-exceeded branch now calls `rescheduleQueueEntry` with the computed Date instead of `updateQueueStatus`. Also drops the previous "reset is in the past -> fall through and try to send" path: a fresh quota window resetting is the email-account row's responsibility, not the processor's, and the previous fall-through would attempt a send against an account whose `quotaUsedToday` was still at the limit. Safer to wait for the reset to be applied. Tests: 12 new vitest specs covering future-reset honored, null/undefined fallback, past-reset floor, equal-to-now floor, sub-floor reset clamped, just-past-floor reset honored, custom minDelay, negative minDelay clamped to zero, fresh-Date guarantee, default-now omission, "never earlier than now" property across a parametric set of inputs. 107/108 tests pass in the full int suite (the 1 failure is the pre-existing Payload-secret config issue on `main`, unrelated). Lint clean for changed files. Co-Authored-By: Paperclip <noreply@paperclip.ing>
jaredzwick
left a comment
There was a problem hiding this comment.
CTO Review — PR #18 (hir-94: quota reschedule fix)
Verdict: LGTM — this is a real bug fix
The bug: processEmailQueue quota-exceeded path updated status and errorMessage but left scheduledFor unchanged. Since getNextPendingEmails filters scheduledFor <= now, the same entry was re-picked every batch, re-checked, and re-skipped in a tight loop. This silently burns quota-check cycles and blocks the queue.
The fix correctly sets scheduledFor to the quota reset time, so the entry won't re-surface until the account's quota window clears. Ready to merge.
jaredzwick
left a comment
There was a problem hiding this comment.
CTO Review — LGTM
Core fix is correct and well-scoped.
Bug fixed: Old path called updateQueueStatus(id, 'pending', null, 'Quota exceeded') which kept status pending but never bumped scheduledFor. Since getNextPendingEmails filters scheduledFor <= now, the entry was re-picked on every batch iteration and spun forever. New path calls rescheduleQueueEntry which actually advances scheduledFor.
rescheduleQueueEntry (db layer): Clean separation from the existing updateQueueStatus. Correctly does NOT touch attemptCount — no send was attempted, only a scheduling decision was made. The conditional errorMessage update pattern avoids clobbering an existing message with null unless the caller explicitly passes undefined — good.
computeQuotaRescheduleAt (pure function): Logic is correct. Floor of now + minDelayMs prevents a tight loop when quotaResetAt is null, past, or too close. Math.max(minDelayMs, 0) handles the negative-delay edge case. Returns a fresh Date instance — safe for callers that mutate.
Test suite: 11 cases, covers all boundary conditions. Particularly good: the invariant test at the end that asserts output >= now for every possible input class.
No concerns. Ready to merge after PR #13, #20, #22 land (same base, no conflicts expected — these are independent feature slices).
Summary
The quota-exceeded branch in
processEmailQueueclaimed to "reschedule for quota reset time" but the underlyingupdateQueueStatuscall only touchedstatusanderrorMessage—scheduledForstayed at its original value. SincegetNextPendingEmailsfilters byscheduledFor <= now, the same entry got re-picked on every batch, re-checked unsubscribe + quota, and re-skipped. Under sustained load with one over-quota account, the processor's entire batch fills with the same N entries every iteration and other accounts' work starves.Changes
libs/db/src/queries/emailQueue.ts: newrescheduleQueueEntry(id, newScheduledFor, errorMessage?)helper. Keeps statuspending, bumpsscheduledFor, setsupdatedAt. Does NOT incrementattemptCountbecause no send was attempted.src/lib/queueScheduling.ts(new): purecomputeQuotaRescheduleAt(quotaResetAt, now?, minDelayMs?). Honors the account's stored quota reset when it's far enough out; floors atnow + minDelay(default 1 minute) when the reset is null, in the past, equal to now, or sooner than the floor. Always returns a Date>= nowso we never schedule backwards.src/lib/emailQueueProcessor.ts: quota-exceeded branch callsrescheduleQueueEntrywith the computed Date. Also drops the previous "reset is in the past → fall through and try to send" path. A fresh quota window is the account row's responsibility; falling through would attempt a send against an account whosequotaUsedTodayis still at the limit. Safer to wait for the reset to be applied.Tests
tests/int/queueScheduling.int.spec.ts— 12 specs: future reset honored; null/undefined fallback; past-reset floor; equal-to-now floor; sub-floor reset clamped to floor; just-past-floor reset honored; customminDelayMs; negativeminDelayMsclamped to zero; fresh-Date guarantee (output is not the input reference); default-nowomission; "output >= now" property across a parametric input set.pnpm test:int: 107/108 pass; the 1 failure is the pre-existingapi.int.spec.tsPayload-secret issue onmain, unrelated.pnpm lintclean for all changed files.Regression analysis
updateQueueStatusis still used everywhere it was before. The newrescheduleQueueEntryis additive.quotaUsedTodayhad not been reset, which Gmail would reject and the entry would then count as a real failure (incrementingattemptCounttoward the max-attempts cap). The new behavior — wait for the reset — is strictly safer.Test plan
dailyQuota: 1andquotaUsedToday: 1in the DB)./api/email-queue/processand confirm: the 5 entries on account feat: email template library + /templates page + Browse picker (HIR-75) #2 send; the 10 entries on the over-quota account getscheduledForbumped to the account'squotaResetAt, and the next process call no longer re-picks them.quotaUsedToday: 0andquotaResetAt: now() + 1 hour, set the queue entry'sscheduledFor: now() + 30 seconds, and confirm a process call right now does not re-pick it.Cumulative HIR-94 progress
🤖 Generated with Claude Code