Add resilience controls for WeatherAPI/FCM dependency outages #51

@BadgerOps

Problem

The weather and push-notification paths depend on external services (WeatherAPI, FCM) but have limited resilience controls, so a transient outage in either service can degrade UOTD recommendations and notifications.

Proposed Fix

  • Add timeouts and retries with backoff and jitter for external calls
  • Add fallback behavior (last-known-good weather / safe default UOTD path)
  • Add health/staleness alerts (e.g., weather cache expiration drift, scheduler failure rate)
  • Document incident runbook for degraded mode
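
A rough sketch of how the first two items could fit together, not a description of the current codebase. It assumes Python; `call_with_retry` and `LastKnownGood` are hypothetical names, the retry counts and delays are placeholder values, and the wrapped call is assumed to enforce its own timeout.

```python
import random
import time
from typing import Any, Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,          # assumed default, tune per dependency
    base_delay: float = 0.5,    # seconds; assumed default
    max_delay: float = 8.0,     # seconds; assumed default
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with capped exponential backoff and full jitter.

    fn is expected to apply its own timeout to the external call.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # retries exhausted; the caller decides on a fallback
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
    raise RuntimeError("attempts must be >= 1")


class LastKnownGood:
    """Keeps the most recent successful value so callers can degrade instead of failing."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self, fetch: Callable[[], Any]) -> Tuple[Any, bool]:
        """Return (value, degraded); degraded is True when the cached value is served."""
        try:
            value = fetch()
        except Exception:
            if self._fetched_at is None:
                raise  # nothing cached yet, so there is no safe fallback
            return self._value, True
        self._value, self._fetched_at = value, self._clock()
        return value, False

    def age_seconds(self) -> Optional[float]:
        """Seconds since the last successful fetch, or None if there has been none."""
        if self._fetched_at is None:
            return None
        return self._clock() - self._fetched_at
```

A caller might combine the two as `cache.get(lambda: call_with_retry(fetch_weather))`, where `fetch_weather` is a hypothetical function wrapping the WeatherAPI request. The `degraded` flag makes the fallback explicit, so the UOTD path can switch to its safe default and the event can be logged.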

Acceptance Criteria

  • Transient external failures do not hard-fail core workflows
  • Stale weather cache and scheduler failures alert operators
  • Degraded mode behavior is explicit and tested
  • Runbook exists for on-call/admin recovery steps
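
The alerting criterion could be checked with something like the sketch below. It is only an illustration: `HealthInputs`, `evaluate_alerts`, the alert names, and both thresholds are assumptions, not values taken from the project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HealthInputs:
    weather_age_seconds: Optional[float]  # None means no weather has ever been cached
    scheduler_runs: int                   # runs in the observation window
    scheduler_failures: int               # failed runs in the same window


def evaluate_alerts(
    h: HealthInputs,
    max_weather_age_seconds: float = 3600.0,  # assumed threshold
    max_failure_rate: float = 0.2,            # assumed threshold
) -> List[str]:
    """Return the names of the alerts that should fire for the given inputs."""
    alerts: List[str] = []
    if h.weather_age_seconds is None:
        alerts.append("weather_cache_empty")
    elif h.weather_age_seconds > max_weather_age_seconds:
        alerts.append("weather_cache_stale")
    # Skip the rate check when there were no runs, to avoid dividing by zero.
    if h.scheduler_runs > 0:
        rate = h.scheduler_failures / h.scheduler_runs
        if rate > max_failure_rate:
            alerts.append("scheduler_failure_rate_high")
    return alerts
```

Keeping the check a pure function of its inputs means the degraded-mode cases can be covered by plain unit tests, without a live WeatherAPI or FCM dependency.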

Labels: enhancement
