Task with retries can circumvent max_active_runs limit #42093

@jmaicher

Apache Airflow version

2.9.3

If "Other Airflow 2 version" selected, which one?

No response

What happened?

We have some DAGs that cannot run in parallel. To prevent parallel execution, we configured max_active_runs=1. We also configured retries. Recently, we observed a case where Airflow still scheduled two parallel DAG runs. We reconstructed what happened from the audit logs and can reliably reproduce it:

GIVEN a DAG with max_active_runs=1 and a task with retries > 0
WHEN the task is running in the context of run A
AND the user manually marks run A as failed (or success)
AND the user clears multiple runs including run A shortly afterwards
AND the scheduler starts the task in the context of another run B
THEN the task of run A is marked as "UP_FOR_RETRY" and restarts after backoff (5 minutes by default) regardless of whether another run is already active
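
As an illustration only, the sequence above can be written down as a toy model. This is not Airflow's scheduler code and makes no claim about the root cause. `ToyDag`, `start_run`, `mark_failed_and_clear`, and `backoff_elapsed` are hypothetical names that simply encode the observed behavior: starting a run respects the limit, while the retry path restarts the task without checking it.

```python
from dataclasses import dataclass, field

MAX_ACTIVE_RUNS = 1  # mirrors max_active_runs=1 from the report


@dataclass
class ToyDag:
    """Toy stand-in for a DAG's runtime state (hypothetical, not Airflow's model)."""

    running: set = field(default_factory=set)        # run ids with a task executing
    retry_queue: list = field(default_factory=list)  # run ids waiting out the backoff

    def start_run(self, run_id: str) -> bool:
        """Run-level gate: start the run only if the limit allows it."""
        if len(self.running) >= MAX_ACTIVE_RUNS:
            return False
        self.running.add(run_id)
        return True

    def mark_failed_and_clear(self, run_id: str) -> None:
        """User marks the run failed and clears it; as observed in the report,
        the terminated task ends up UP_FOR_RETRY."""
        self.running.discard(run_id)
        self.retry_queue.append(run_id)

    def backoff_elapsed(self) -> None:
        """Retry path as observed: restarts tasks without consulting the gate."""
        while self.retry_queue:
            self.running.add(self.retry_queue.pop(0))


def simulate() -> list:
    dag = ToyDag()
    dag.start_run("A")              # task running in run A
    dag.mark_failed_and_clear("A")  # user action on run A
    dag.start_run("B")              # scheduler starts run B, limit satisfied
    dag.backoff_elapsed()           # retry of run A fires after the backoff
    return sorted(dag.running)


if __name__ == "__main__":
    print(simulate())  # ['A', 'B']: two runs active despite the limit of 1
```

In this model the invariant `len(running) <= MAX_ACTIVE_RUNS` holds at every step until the retry fires, which matches what we saw in the audit logs.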

What you think should happen instead?

  • Airflow should not schedule two parallel runs when max_active_runs=1
  • Airflow should not retry a task when the user marks its run as failed/success and clears it shortly afterwards

How to reproduce

See above. Using the Kubernetes executor (or similar) is likely necessary to reproduce this, as it extends the time between the user action (mark as failed/success) and the receipt of SIGTERM in the task instance. We also used a task that sleeps longer than the retry backoff (5m by default) so that the two runs can actually be seen running in parallel.
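
A minimal DAG definition along these lines should be enough to try the reproduction. It is a sketch under assumptions, not the DAG from our deployment: the `dag_id`, the dates, and the 10-minute sleep are made up for illustration, and it uses `PythonOperator` for brevity where our real workload runs through the Kubernetes pod operator.

```python
import time
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def sleep_longer_than_backoff():
    # Sleep past the 5-minute retry delay so overlapping runs become visible.
    time.sleep(10 * 60)


with DAG(
    dag_id="max_active_runs_retry_repro",  # hypothetical name
    start_date=datetime(2024, 1, 1),       # assumed dates; a few daily runs
    end_date=datetime(2024, 1, 5),         # give us several runs to clear together
    schedule="@daily",
    catchup=True,
    max_active_runs=1,                     # the limit that gets circumvented
    default_args={
        "retries": 1,                          # retries > 0, as in the report
        "retry_delay": timedelta(minutes=5),   # same as the default backoff
    },
) as dag:
    PythonOperator(
        task_id="long_sleep",
        python_callable=sleep_longer_than_backoff,
    )
```

With this DAG deployed, follow the steps above: while `long_sleep` is running in one run, mark that run as failed, clear several runs including it, and watch whether a second run becomes active once the retry delay has elapsed.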

Operating System

Debian 12 (bookworm)

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==8.3.3

Deployment

Official Apache Airflow Helm Chart

Deployment details

Workloads run via the KubernetesExecutor and the KubernetesPodOperator.

Anything else?

This happens rarely, but when it does, it causes severe problems because the DAG/task must not run in parallel.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
