Numerous error reports in airflow logs due to function argument mismatch #40011

@leakec

Description

Apache Airflow version

2.9.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

In the airflow logs I'm seeing this error over and over again

2024-06-02 07:29:30,369 ERROR - Error syncing the Celery executor, ignoring it.
Traceback (most recent call last):
  File "/home/leake/.local/lib/python3.12/site-packages/airflow/providers/celery/executors/celery_executor.py", line 362, in update_task_state
    self.success(key, info)
  File "/home/leake/.local/lib/python3.12/site-packages/airflow/executors/base_executor.py", line 337, in success
    self.change_state(key, TaskInstanceState.SUCCESS, info)
  File "/home/leake/.local/lib/python3.12/site-packages/airflow/providers/celery/executors/celery_executor.py", line 351, in change_state
    super().change_state(key, state, info, remove_running=remove_running)
TypeError: BaseExecutor.change_state() got an unexpected keyword argument 'remove_running'

Ultimately, this seems to crash the scheduler after only an hour or so.

What you think should happen instead?

Looking into the code near this point, there is a try/except block that is supposed to catch this error if the airflow version is less than 2.9.2. The except block catches the AttributeError type. I believe what is supposed to happen is the except block is triggered, and we call a similar function signature, but without the remove_running keyword argument.

However, at least in Python 3.12, the error type is TypeError rather than AttributeError, so the except block is not triggered and we get an error report instead.
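To illustrate the mismatch, here is a minimal sketch that does not use Airflow at all. `OldBase` is a hypothetical stand-in for a base executor whose `change_state` has no `remove_running` parameter; it is not the real `BaseExecutor`. Passing an unexpected keyword argument to a Python function raises TypeError, so a handler that only lists AttributeError is skipped.

```python
class OldBase:
    """Hypothetical stand-in for a base class that lacks `remove_running`."""

    def change_state(self, key, state, info=None):
        return (key, state, info)


def classify_error():
    """Report which exception type an unexpected keyword argument raises."""
    try:
        # Same call shape as in the traceback: an extra keyword the base
        # method does not accept.
        OldBase().change_state("task-1", "success", None, remove_running=True)
    except AttributeError:
        return "AttributeError"
    except TypeError:
        return "TypeError"
    return "no error"


print(classify_error())
```

As far as I can tell this is ordinary Python behavior and not specific to 3.12, although 3.12 is the only version I have tried with Airflow itself.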

How to reproduce

Create an Airflow server that has these packages installed:

pip install 'apache-airflow[celery]' 'apache-airflow[ldap]'

I'm not sure whether ldap is really needed to reproduce the bug. Start an Airflow instance that uses the CeleryExecutor and attach a worker. Once you run a job, you should see the error report above in the logs. This may require a specific Python version to reproduce; I'm not sure, as I've only tried on Python 3.12.

Operating System

Fedora 40

Versions of Apache Airflow Providers

apache-airflow-providers-celery==3.7.1
apache-airflow-providers-common-io==1.3.1
apache-airflow-providers-common-sql==1.13.0
apache-airflow-providers-fab==1.1.0
apache-airflow-providers-ftp==3.9.0
apache-airflow-providers-http==4.11.0
apache-airflow-providers-imap==3.6.0
apache-airflow-providers-redis==3.7.0
apache-airflow-providers-smtp==1.7.0
apache-airflow-providers-sqlite==3.8.0

Deployment

Other

Deployment details

I just have a local instance that I've created myself by tweaking the default config file that comes with airflow. I essentially followed the quick start guide, and switched over to using MySQL and Celery with Redis as the broker.

Anything else?

This error gets reported really frequently. I think it may ultimately be causing the scheduler to crash. I did some local testing, and changing the except line from

except AttributeError:

to

except (AttributeError, TypeError):

fixes the issue, i.e., we jump into the except block as expected. I'll submit a PR with this change shortly.
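A rough sketch of the proposed change, using hypothetical stand-in classes (`OldBaseExecutor`, `PatchedExecutor`) rather than the real provider code. The body of the except block here is an assumption based on the description above: retry the same call without the `remove_running` keyword.

```python
class OldBaseExecutor:
    """Hypothetical stand-in for a base executor without `remove_running`."""

    def __init__(self):
        self.running = {"task-1"}
        self.event_buffer = {}

    def change_state(self, key, state, info=None):
        self.running.discard(key)
        self.event_buffer[key] = (state, info)


class PatchedExecutor(OldBaseExecutor):
    """Hypothetical subclass that passes the newer keyword when it can."""

    def change_state(self, key, state, info=None, remove_running=True):
        try:
            super().change_state(key, state, info, remove_running=remove_running)
        except (AttributeError, TypeError):
            # Older base class: fall back to the call without the keyword.
            super().change_state(key, state, info)


executor = PatchedExecutor()
executor.change_state("task-1", "success")
print(executor.event_buffer)
```

One caveat with this approach: catching TypeError this broadly would also swallow a TypeError raised inside the base method for an unrelated reason, so the fallback call would run in that case too.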

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
