hir-124: dedupe replyTriage + createReply on duplicate webhook delivery#16
Conversation
Gate the warm-reply triage path on the same alreadyReplied check the event-write already had, so a redelivered Gmail Pub/Sub (or repeated IMAP poll) hit doesn't double-write the reply row, double-bill the LLM triage call, or cancel a still-scheduled silent follow-up twice. Adds a unique index on reply (contact_id, event_id) as a belt-and-braces DB-layer invariant — Postgres treats NULL as distinct so legacy rows with a null event_id won't collide. Test additions: a two-delivery test that asserts triageReply, createReply, createEmailEvent, createReplyFollowup, and incrementCampaignStat each fire exactly once across the two calls; and rewrites the existing "alreadyReplied" test to match the new duplicate_delivery short-circuit (the old test was implicitly conflating duplicate webhooks with genuine new replies, which is the bug HIR-124 is fixing).
jaredzwick
left a comment
There was a problem hiding this comment.
CTO Review — PR #16 (hir-124: dedupe replyTriage + createReply)
Verdict: Approve (pending #14 merge before landing)
Short-circuit logic — correct. Moving the alreadyReplied check to an early return before createEmailEvent / incrementCampaignStat / triageReply / createReply / createReplyFollowup is the right fix. The old code's "alreadyReplied" path was a misnomer — it only skipped the event write and stat bump but still ran the full triage pipeline, which is the bug HIR-124 is fixing. The rewrite of that test is therefore correct, not a regression.
Schema / migration — CREATE UNIQUE INDEX (contact_id, event_id) with Postgres NULL-distinct semantics is safe. Legacy rows with event_id IS NULL will not collide. Migration file name matches drizzle journal. Good belt-and-braces.
eventId simplification — let eventId: string | undefined → const eventId = nanoid() after the early-return gate is cleaner and removes the downstream ?? null crutch. Type is now correct throughout.
Test coverage — the two-delivery test with exact call-count assertions (toHaveBeenCalledTimes(1) across all 5 side effects) is the right way to prove dedup. Nothing I'd add.
One flag to carry forward (not a blocker): As noted in the PR body, a genuine second reply from the same prospect on the same tracking_id will also short-circuit. That's the correct v1 trade-off (founder sees it in Gmail). When we want multi-turn, the dedup key needs to be the Gmail message-id, not the tracking-id. Worth a TODO comment in recordInboundReply near the eventExistsForTracking call so the next engineer doesn't have to rediscover this.
Landing plan: This is stacked on hir-120/warm-reply-triage (#14). Merge #14 first, rebase this branch, then land. No changes needed from my side — LGTM as-is.
|
CTO review: LGTM — correct short-circuit on duplicate delivery. Unique index on |
Summary
recordInboundReplyonalreadyRepliedso a redelivered Gmail Pub/Sub (or repeated IMAP poll) hit doesn't double-write thereplyrow, double-bill the LLM triage call, double-cancel a still-scheduled silent follow-up, or duplicate cards in/dashboard/replies. Returnsreason: "duplicate_delivery"so callers can log the dedup explicitly.reply (contact_id, event_id)(drizzle migration0009_damp_mulholland_black.sql) as belt-and-braces in case theeventExistsForTrackingcheck ever races. Postgres treats NULL as distinct, so legacy rows with a nullevent_idwon't collide.triageReply,createReply,createEmailEvent,createReplyFollowup, andincrementCampaignStateach fire exactly once across the two calls. Rewrote the existing "alreadyReplied" test to match the new short-circuit semantics — the old test was implicitly conflating duplicate webhooks with genuine new replies, which is the bug HIR-124 is fixing.Why this branches off PR #14
This is the follow-up CTO review surfaced on HIR-122 / PR #14. Branching off
hir-120/warm-reply-triageso it can land in the same week as v1 of the warm-reply UI without sitting behind a main merge.Behavior note worth a CTO eye
With this change, a genuine second reply from the same prospect against the same outbound (same
tracking_id) will also short-circuit — v1 of the warm-reply UI doesn't model multi-turn replies. That's the correct trade for v1 (the founder will almost always notice the new reply via Gmail UI), but if we want multi-turn we'll need a different dedup key (e.g. message-id, not tracking-id).Test plan
pnpm vitest run tests/int/replyFollowup.int.spec.ts— 14/14 passpnpm tsc --noEmit -p tsconfig.json— cleanpnpm vitest run— 110/110 pass (the one failed file,api.int.spec.ts, is a pre-existing Payload secret-key env issue on the base branch, not introduced here — verified by re-running it from base)pnpm db:migrateagainst staging to apply0009_damp_mulholland_black.sql