
AIRFLOW__SCHEDULER__TASK_QUEUED_TIMEOUT configuration ignored #59553

@CreeperBeatz

Description

Apache Airflow version

3.1.5

If "Other Airflow 3 version" selected, which one?

No response

What happened?

For my workload, I ingest video files and then create tasks to process each one. It is normal to create 30+ tasks that get processed over the span of 40+ minutes. On backfill, it is entirely normal for my GPU workers to be unavailable and for a task to wait 1h+ before it executes.

This is why I changed AIRFLOW__SCHEDULER__TASK_QUEUED_TIMEOUT to 10,000 (see image below).

[screenshot: scheduler configuration with the increased task_queued_timeout]
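
For reference, the same override can be expressed through Airflow's AIRFLOW__{SECTION}__{KEY} environment-variable convention (a sketch; whether the value was set via airflow.cfg or the environment, the effect should be the same):

# Environment-variable equivalent of the config change, in seconds.
# Must be visible to the scheduler process to take effect.
export AIRFLOW__SCHEDULER__TASK_QUEUED_TIMEOUT=10000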

I also changed my task definition to allow a task to run for up to 24 hours (execution_timeout), so that is not the issue either.

[screenshot: task definition with a 24-hour execution_timeout]

However, even with both settings in place, my tasks fail and the retry mechanism kicks in. I have worked around it by increasing the number of retries, but that is a band-aid, not a fix.

[screenshot: task attempts failing and being retried]
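
The workaround looked roughly like this (hypothetical values; the report does not state the exact numbers used):

from datetime import timedelta

# Band-aid: pad retries so a task survives repeated queued-timeout
# failures while it waits for the single GPU worker to free up.
default_args = {
    "owner": "airflow",
    "retries": 10,                        # assumed value
    "retry_delay": timedelta(minutes=5),  # assumed value
}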

What you think should happen instead?

AIRFLOW__SCHEDULER__TASK_QUEUED_TIMEOUT should be taken into account, and tasks should not time out after roughly 15 minutes.

All tasks eventually complete. With 20 runs × 2.5 min each and 1 worker, the last task should wait ~50 min in the queue, well within the 100,000 s (~27.8 h) timeout.

How to reproduce

1. Install Airflow

uv pip install "apache-airflow[celery]==3.1.5" \
    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-3.1.5/constraints-3.12.txt"
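
A quick sanity check that the pinned version installed (assumes the virtualenv is active):

airflow version   # should print 3.1.5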

2. Initialize and configure

airflow db migrate
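# airflow.cfg is generated in $AIRFLOW_HOME (default: ~/airflow); run the
# sed commands from that directory or adjust the path.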
sed -i 's/^executor = .*/executor = CeleryExecutor/' airflow.cfg
sed -i 's/^task_queued_timeout = .*/task_queued_timeout = 100000.0/' airflow.cfg
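
To confirm the scheduler will actually see the value, query the effective configuration (stock Airflow CLI):

airflow config get-value scheduler task_queued_timeout   # should print 100000.0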

3. Create the DAG

Save as dags/long_running_task_dag.py:

from datetime import datetime, timedelta
from time import sleep

from airflow.sdk import dag, task


@dag(
    dag_id="long_running_task_bug_repro",
    schedule=None,
    start_date=datetime(2025, 1, 1),
    catchup=False,
    default_args={
        "owner": "airflow",
        "retries": 0,
    },
)
def long_running_task_bug_repro():

    @task(
        queue="gpu_queue",
        execution_timeout=timedelta(hours=24),
    )
    def gpu_simulation():
        """Simulate GPU processing - sleeps for 2.5 minutes."""
        sleep(150)

    gpu_simulation()


long_running_task_bug_repro()

4. Start services

Terminal 1 - Core services plus a default-queue worker:

airflow api-server --port 8080 &
airflow scheduler &
airflow dag-processor &
airflow triggerer &
airflow celery worker --queues default

Terminal 2 - Single worker (simulates limited GPU resources):

airflow celery worker --queues gpu_queue --concurrency 1
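
Optionally, once the dag-processor has picked the file up, confirm the DAG registered cleanly (stock Airflow CLI commands):

airflow dags list-import-errors   # should report no errors
airflow dags list | grep long_running_task_bug_repro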

5. Trigger multiple DAG runs

for i in $(seq 1 20); do
    airflow dags trigger long_running_task_bug_repro
done
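
While waiting, run states can be polled from another terminal (a sketch; in Airflow 3 the dag id is positional, whereas the Airflow 2 CLI used -d):

# Poll run states once a minute until the queued-timeout failures appear.
watch -n 60 'airflow dags list-runs long_running_task_bug_repro'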

6. Wait ~15 minutes. Queued tasks begin failing and retrying despite task_queued_timeout being set to 100,000 s.

Operating System

Ubuntu 22.04 LTS

Versions of Apache Airflow Providers

apache-airflow-providers-celery==3.14.0
apache-airflow-providers-common-compat==1.10.0
apache-airflow-providers-common-io==1.7.0
apache-airflow-providers-common-sql==1.30.0
apache-airflow-providers-smtp==2.4.0
apache-airflow-providers-standard==1.10.0

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
