-
Notifications
You must be signed in to change notification settings - Fork 145
fix: reconcile state after msg resend #3613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis change adjusts delivery status priority ordering in the sorting function and refactors webhook event reconciliation to improve error handling, variable scoping, and logging. The webhook changes introduce structured logging keys, separate error variable scoping in retry paths, and return validation errors for payload serialization failures. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
openmeter/notification/eventhandler/deliverystatus.go (1)
60-66: Updated priority map keeps RESENDING between PENDING and SENDINGThe new priorities (PENDING=0, RESENDING=1, SENDING=2, FAILED/SUCCESS=3) line up nicely with how
reconcileWebhookEventprocesses active statuses — freshPENDINGfirst, thenRESENDINGretries, then long‑runningSENDING. That fits the intent of the reconciler and the RESENDING fix.Tiny nit:
FAILEDandSUCCESSstill share the same numeric priority while the comparator only looks at<and otherwise returns1. If this helper ever gets used on slices that include final states, that technically violates strict ordering and could make the sort result a bit non‑deterministic. If you care about that, you could either give them distinct priorities or add a tie‑breaker whenpriority[a.State] == priority[b.State].openmeter/notification/eventhandler/webhook.go (3)
57-82: Hoistingerr/webhooksOut/msgclarifies shared provider stateMoving
err,webhooksOut, andmsgto this scope makes it much clearer that you fetch webhooks and the provider message once per event and then reuse that state across all delivery statuses. That matches how the reconcile loop is written and plays nicely with the later checks onerrandmsgforPENDING/SENDINGstates.One small readability thought: the same
errvariable is also reused later forUpdateEventDeliveryStatus(Line 422), so a repo error in one iteration will overwrite the provider error that subsequent iterations see. That behavior existed before this refactor, but it’s a bit surprising when you’re mentally tracking which error thewebhook.Is*Errorchecks are referring to. If you want to make this more obvious/robust, splitting it into e.g.providerErr(for get/send) andupdateErr(for the repo writes) would keep the control flow easier to reason about and avoid any chance of DB errors influencing webhook error handling.
152-167: Routing validation errors into the unrecoverable branch looks goodAdding
webhook.IsValidationError(err)alongsideIsUnrecoverableErrorin thePENDINGbranch is a nice touch — payload/validation issues aren’t realistically recoverable by retrying the provider, so treating them as final failures is a sane choice. The extra structured fields in the error log (notification.delivery_status.state,notification.event.id, etc.) should also make debugging much easier.Minor wording nit: the log message and span text still talk about “fetching webhook message from provider” even though, with the new ValidationError path, some of those errors can now originate from
sendWebhookMessage(payload serialization) rather than just the fetch. Not a big deal, but if logs are used heavily for triage, you might want to tweak the wording to something a bit more generic like “handling webhook message at provider”.
243-257: Keeping resend errors local fixesRESENDINGreconciliation without breakingPENDING/SENDINGThe local
err := h.resendWebhookMessage(...)here is exactly what you want: errors from the resend operation no longer stomp on the outererrthat drives thePENDING/SENDINGlogic. That lines up with the PR description and avoids the situation where a resend failure accidentally gets interpreted as a provider fetch/send error for other statuses in the same reconcile run.Tiny comment nit: “Ignore any error from the resend operation” is a bit ambiguous since you still append it to
errs(which is good). If you care about precision, something like “Ignore resend errors for state transitions, but still surface them viaerrs” would match what the code actually does.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
openmeter/notification/eventhandler/deliverystatus.go(1 hunks)openmeter/notification/eventhandler/webhook.go(6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go
⚙️ CodeRabbit configuration file
**/*.go: In general when reviewing the Golang code make readability and maintainability a priority, even potentially suggest restructuring the code to improve them.Performance should be a priority in critical code paths. Anything related to event ingestion, message processing, database operations (regardless of database) should be vetted for potential performance bottlenecks.
Files:
openmeter/notification/eventhandler/webhook.goopenmeter/notification/eventhandler/deliverystatus.go
🧠 Learnings (2)
📚 Learning: 2025-11-07T12:29:09.087Z
Learnt from: chrisgacsal
Repo: openmeterio/openmeter PR: 3581
File: openmeter/notification/eventhandler/webhook.go:391-396
Timestamp: 2025-11-07T12:29:09.087Z
Learning: In the OpenMeter notification system (specifically in openmeter/notification/eventhandler/webhook.go), webhook channels are equivalent to notification rules. When sending webhook messages, the rule ID (event.Rule.ID) should be used as the channel identifier in the Channels field of SendMessageInput.
Applied to files:
openmeter/notification/eventhandler/webhook.go
📚 Learning: 2025-11-07T11:28:40.430Z
Learnt from: chrisgacsal
Repo: openmeterio/openmeter PR: 3581
File: openmeter/notification/webhook/svix/webhook.go:253-257
Timestamp: 2025-11-07T11:28:40.430Z
Learning: In the openmeter notification system (specifically in openmeter/notification/webhook/svix/webhook.go), event type filtering for webhooks is not allowed or supported. Only channel-based filtering is used.
Applied to files:
openmeter/notification/eventhandler/webhook.go
🧬 Code graph analysis (2)
openmeter/notification/eventhandler/webhook.go (2)
openmeter/notification/webhook/handler.go (3)
Webhook(14-28)ListWebhooksInput(32-38)Message(187-207)openmeter/notification/webhook/errors.go (3)
IsUnrecoverableError(150-152)IsValidationError(45-47)NewValidationError(37-43)
openmeter/notification/eventhandler/deliverystatus.go (1)
openmeter/notification/deliverystatus.go (4)
EventDeliveryStatusStateResending(18-18)EventDeliveryStatusStateSending(16-16)EventDeliveryStatusStateFailed(15-15)EventDeliveryStatusStateSuccess(14-14)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Artifacts / Benthos Collector Container image
- GitHub Check: Artifacts / Container image
- GitHub Check: Migration Checks
- GitHub Check: Lint
- GitHub Check: Code Generators
- GitHub Check: Test
- GitHub Check: Build
- GitHub Check: Repository Scan
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
openmeter/notification/eventhandler/webhook.go (1)
536-541: Wrapping payload serialization failures inValidationErrorfits the error modelReturning
webhook.NewValidationError(fmt.Errorf("failed to serialize webhook message payload: %w", err))oneventAsPayloadfailure makes the error taxonomy much cleaner: reconcile can now explicitly treat bad/unsupported payloads as validation problems, and thePENDINGbranch’sIsValidationErrorcheck correctly routes them into the unrecoverable path instead of retrying.You still preserve useful context via the wrapped message, and this plays nicely with the rest of the webhook error helpers. Also nice to see
Channels: []string{event.Rule.ID}is still used here, which is consistent with the “rule ID as channel identifier” behavior of the webhook subsystem. Based on learnings.
Overview
Fix reconcile state of delivery status when re-sending event by not overriding
errwhich is used to control reconcile flow for delivery statuses inPENDINGorSENDINGstates. This is a temporary measure until there is a caching logic implemented for reconciler to prevent re-fetching resources from webhook provider for each access request.Summary by CodeRabbit
Bug Fixes
Refactor