Reduce timeouts for batched update #4814

Closed

Description

The batched update DAGs have default timeouts that are quite long, mostly because we were unsure how long runs would take once we started running the popularity update regularly:

DAGRUN_TIMEOUT = timedelta(days=31 * 3)
SELECT_TIMEOUT = timedelta(hours=24)
UPDATE_TIMEOUT = timedelta(days=30 * 3) # 3 months

We could have separate timeout values for the automated and manual batched update runs, but for the time being I think we can safely change the following values:

  • UPDATE_TIMEOUT should be set to 30 days (rather than 3 months)
  • DAGRUN_TIMEOUT should be set to UPDATE_TIMEOUT + SELECT_TIMEOUT (rather than explicitly setting the timeout)
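
A minimal sketch of what the two changes above might look like, assuming the three constants stay together in one module as in the snippet earlier in this issue. The values are the ones proposed here, not a merged implementation, and the final check against the 12-day figure is only illustrative.

```python
from datetime import timedelta

# Unchanged from the current defaults.
SELECT_TIMEOUT = timedelta(hours=24)

# Proposed: 30 days instead of 3 months.
UPDATE_TIMEOUT = timedelta(days=30)

# Proposed: derive the DAG run timeout from the task timeouts
# instead of setting it explicitly, so the values cannot drift apart.
DAGRUN_TIMEOUT = UPDATE_TIMEOUT + SELECT_TIMEOUT

print(DAGRUN_TIMEOUT)  # 31 days, 0:00:00

# Illustrative sanity check: the longest run reported in this issue
# was just under 12 days, which still fits well inside the new limit.
longest_observed_run = timedelta(days=12)
print(UPDATE_TIMEOUT > longest_observed_run)  # True
```

Deriving DAGRUN_TIMEOUT means a later change to either task timeout is picked up automatically.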

Additional context

The longest batched_update run we've had was just under 12 days, and that was the automated run for Flickr's popularity update.

Metadata

Assignees

No one assigned

    Labels

    • good first issue (New-contributor friendly)
    • help wanted (Open to participation from the community)
    • 💻 aspect: code (Concerns the software code in the repository)
    • 🟩 priority: low (Low priority and doesn't need to be rushed)
    • 🧰 goal: internal improvement (Improvement that benefits maintainers, not users)
    • 🧱 stack: catalog (Related to the catalog and Airflow DAGs)

    Type

    No type

    Projects

    • Status

      ✅ Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
