Description
Apache Airflow version
main (development)
If "Other Airflow 2 version" selected, which one?
No response
What happened?
I've noticed that OpenLineage's listener sends two FAIL events when a task still has retries left. The problem is not OpenLineage-related: the listener_manager gets called twice.
The cause is that the get_listener_manager().hook.on_task_instance_failed listener call is made on the scheduler, exactly here. This happens because the task falls into the if branch for tasks that were killed_externally, so ti.handle_failure is called (and it contains the listener call).
What you think should happen instead?
I think this simple task execution should not be treated as an external task state change, and the listener should certainly be called only once.
How to reproduce
- Run this DAG on latest main.
- Look in the scheduler logs. You should find ERROR logs such as
  Executor CeleryExecutor ... reported that the task instance ... finished with state success, but the task instance's state attribute is running
  followed by some logs (at DEBUG level) about the listener's on_task_instance_failed being called.
import datetime as dt

from airflow import DAG
from airflow.providers.standard.operators.bash import BashOperator

with DAG(
    dag_id="dag_failure_wait",
    start_date=dt.datetime(2024, 7, 3),
    schedule=None,
    catchup=False,
) as dag:
    task_failure = BashOperator(
        task_id="task_failure",
        bash_command="sleep 2 && exit 1;",
        retry_delay=1,
        retries=1,
    )
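To make the log check in the steps above less manual, a small helper along these lines could scan a scheduler log for the state-mismatch ERROR message. This is only a sketch: the regular expression is built from the message text quoted in this report, and the function name count_state_mismatches is hypothetical, not part of Airflow.

```python
import re
from collections import Counter

# Pattern derived from the ERROR message quoted in this report.
# Assumption: the wording is the same in the Airflow version under test.
MISMATCH = re.compile(
    r"reported that the task instance (?P<ti>.+?) "
    r"finished with state (?P<reported>\w+), "
    r"but the task instance's state attribute is (?P<actual>\w+)"
)


def count_state_mismatches(lines):
    """Count (reported_state, actual_state) pairs found in the log lines."""
    counts = Counter()
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            counts[(match.group("reported"), match.group("actual"))] += 1
    return counts
```

It accepts any iterable of lines, so an open file handle for one of the attached scheduler logs can be passed in directly.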
Operating System
MacOS
Versions of Apache Airflow Providers
latest main
Deployment
Virtualenv installation
Deployment details
Breeze, with LocalExecutor / CeleryExecutor (tested both).
breeze start-airflow -b postgres
breeze start-airflow --integration openlineage -b postgres
breeze start-airflow --integration openlineage -b postgres --executor=CeleryExecutor
Anything else?
Logs for CeleryExecutor run:
log_celery_20250408_105902.txt
log_scheduler_20250408_105917.txt
Logs for LocalExecutor run:
log_scheduler_20250408_110833.txt
Logs for LocalExecutor without the OpenLineage integration:
log_scheduler_20250408_111105.txt
OL events received for the run (notice two FAIL events for first task run):
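The duplicate can also be checked programmatically on the captured events. The sketch below assumes the events are available as parsed JSON dictionaries carrying the standard OpenLineage eventType and run.runId fields; the helper names are hypothetical and no real event data is shown.

```python
from collections import Counter


def fail_events_per_run(events):
    """Count FAIL events per OpenLineage run id."""
    return Counter(
        event["run"]["runId"]
        for event in events
        if event.get("eventType") == "FAIL"
    )


def runs_with_duplicate_fail(events):
    """Return the run ids that received more than one FAIL event."""
    counts = fail_events_per_run(events)
    return sorted(run_id for run_id, n in counts.items() if n > 1)
```

With the bug described here, the run id of the first task attempt would be expected to show up in the result of runs_with_duplicate_fail.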
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct
