
Docker image fails to start if celery config section is not defined #29537

@Limess

Description

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

Using Airflow 2.3.4

We removed from airflow.cfg any config values we did not explicitly set. This was to make future upgrades less involved, as we would only need to compare the configuration values we explicitly set rather than every default across versions. This approach has been recommended in Slack.

For example, we set AIRFLOW__CELERY__BROKER_URL as an environment variable rather than in airflow.cfg, so we removed the [celery] section from the Airflow configuration entirely.

We set AIRFLOW__CORE__EXECUTOR=CeleryExecutor, so we are using the Celery executor.
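
For context, these variables follow Airflow's AIRFLOW__{SECTION}__{KEY} naming convention. A minimal sketch of our setup (the broker URL is a placeholder, not our real one):

    # Config supplied via environment variables instead of airflow.cfg,
    # using Airflow's AIRFLOW__{SECTION}__{KEY} convention.
    export AIRFLOW__CORE__EXECUTOR="CeleryExecutor"
    export AIRFLOW__CELERY__BROKER_URL="redis://redis:6379/0"  # placeholder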

Upon starting the Airflow scheduler, it exited with code 1 and printed this message:

The section [celery] is not found in config.

Upon adding an empty

[celery]

section back to airflow.cfg, the error went away. I have verified that Airflow still picks up AIRFLOW__CELERY__BROKER_URL correctly.
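
The lookup can be verified with the same command the image entrypoint uses (the broker URL below is a placeholder):

    # With the empty [celery] section present, the env var resolves:
    export AIRFLOW__CELERY__BROKER_URL="redis://redis:6379/0"
    airflow config get-value celery broker_url
    # -> redis://redis:6379/0

    # Without the [celery] section, the same command exits 1 with:
    # The section [celery] is not found in config.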

What you think should happen instead

I'd expect Airflow to take the defaults as listed here; I wouldn't expect the absence of a configuration section to cause errors.

How to reproduce

  1. Set up a Docker image for the Airflow scheduler with apache/airflow:slim-2.3.4-python3.10 and the following configuration in airflow.cfg, with no [celery] section:

    [core]
    # The executor class that airflow should use. Choices include
    # ``SequentialExecutor``, ``LocalExecutor``, ``CeleryExecutor``, ``DaskExecutor``,
    # ``KubernetesExecutor``, ``CeleryKubernetesExecutor`` or the
    # full import path to the class when using a custom executor.
    executor = CeleryExecutor
    
    [logging]
    [metrics]
    [secrets]
    [cli]
    [debug]
    [api]
    [lineage]
    [atlas]
    [operators]
    [hive]
    [webserver]
    [email]
    [smtp]
    [sentry]
    [celery_kubernetes_executor]
    [celery_broker_transport_options]
    [dask]
    [scheduler]
    [triggerer]
    [kerberos]
    [github_enterprise]
    [elasticsearch]
    [elasticsearch_configs]
    [kubernetes]
    [smart_sensor]
    
  2. Run the scheduler command, also setting AIRFLOW__CELERY__BROKER_URL to point to a Celery Redis broker (see the sketch after this list).

  3. Observe that the scheduler exits.
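
For convenience, a one-shot reproduction of the steps above, assuming the airflow.cfg from step 1 is in the current directory and that /opt/airflow/airflow.cfg is the config path in the official image (the Redis URL is a placeholder; no reachable broker is needed, since the failure happens before any connection attempt):

    docker run --rm \
        -v "$(pwd)/airflow.cfg:/opt/airflow/airflow.cfg" \
        -e AIRFLOW__CELERY__BROKER_URL="redis://redis:6379/0" \
        apache/airflow:slim-2.3.4-python3.10 scheduler
    # Exits 1 with: The section [celery] is not found in config.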

Operating System

Ubuntu 20.04.5 LTS (Focal Fossa)

Versions of Apache Airflow Providers

No response

Deployment

Other Docker-based deployment

Deployment details

AWS ECS

Docker apache/airflow:slim-2.3.4-python3.10

Separate services:

  • Webserver
  • Triggerer
  • Scheduler
  • Celery worker
  • Celery flower

Anything else

This seems to occur due to this get-value check in the Airflow image entrypoint:

    function wait_for_celery_broker() {
        # Verifies connection to Celery Broker
        local executor
        executor="$(airflow config get-value core executor)"
        if [[ "${executor}" == "CeleryExecutor" ]]; then
            local connection_url
            connection_url="$(airflow config get-value celery broker_url)"
            wait_for_connection "${connection_url}"
        fi
    }
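
One possible direction for a fix, sketched here as an untested suggestion rather than a patch: let the check fall back to the AIRFLOW__CELERY__BROKER_URL environment variable when get-value exits non-zero because the section is missing:

    # Sketch only; wait_for_connection is the existing entrypoint helper.
    function wait_for_celery_broker() {
        local executor
        executor="$(airflow config get-value core executor)"
        if [[ "${executor}" == "CeleryExecutor" ]]; then
            local connection_url
            if ! connection_url="$(airflow config get-value celery broker_url 2>/dev/null)"; then
                # [celery] may be absent from airflow.cfg while the broker
                # URL is still supplied via the environment.
                connection_url="${AIRFLOW__CELERY__BROKER_URL:-}"
            fi
            if [[ -n "${connection_url}" ]]; then
                wait_for_connection "${connection_url}"
            fi
        fi
    }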

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
