-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Closed
Labels
kind:bugThis is a clearly a bugThis is a clearly a bugpriority:highHigh priority bug that should be patched quickly but does not require immediate new releaseHigh priority bug that should be patched quickly but does not require immediate new release
Milestone
Description
Apache Airflow version: 2.0.0a1
What happened:
When a task fails on the first try, the log output for additional tries go to the log for the first attempt.
What you expected to happen:
The logs should go to the correct log file. For the default configuration, the log filename template is log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log, so additional numbered .log files should be created.
How to reproduce it:
Create a test dag:
from datetime import timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.utils.dates import days_ago
with DAG(
dag_id="trynumber_demo",
default_args={"start_date": days_ago(2), "retries": 1, "retry_delay": timedelta(0)},
schedule_interval=None,
) as dag:
def demo_task(ti=None):
print("Running demo_task, try_number =", ti.try_number)
if ti.try_number <= 1:
raise ValueError("Shan't")
task = PythonOperator(task_id="demo_task", python_callable=demo_task)
and trigger this dag:
$ airflow dags trigger trynumber_demo
then observe that triggernumber_demo/demo_task/<execution_date>/ only contains 1.log, which contains the full output for 2 runs:
[...]
--------------------------------------------------------------------------------
[2020-10-21 13:29:07,958] {taskinstance.py:1020} INFO - Starting attempt 1 of 2
[2020-10-21 13:29:07,959] {taskinstance.py:1021} INFO -
--------------------------------------------------------------------------------
[...]
[2020-10-21 13:29:08,163] {logging_mixin.py:110} INFO - Running demo_task, try_number = 1
[2020-10-21 13:29:08,164] {taskinstance.py:1348} ERROR - Shan't
Traceback (most recent call last):
[...]
ValueError: Shan't
[2020-10-21 13:29:08,168] {taskinstance.py:1392} INFO - Marking task as UP_FOR_RETRY. dag_id=trynumber_demo, task_id=demo_task, execution_date=20201021T122907, start_date=20201021T122907, end_date=20201021T122908
[...]
[2020-10-21 13:29:09,121] {taskinstance.py:1019} INFO -
--------------------------------------------------------------------------------
[2020-10-21 13:29:09,121] {taskinstance.py:1020} INFO - Starting attempt 2 of 2
[2020-10-21 13:29:09,121] {taskinstance.py:1021} INFO -
--------------------------------------------------------------------------------
[...]
[2020-10-21 13:29:09,333] {logging_mixin.py:110} INFO - Running demo_task, try_number = 2
[2020-10-21 13:29:09,334] {python.py:141} INFO - Done. Returned value was: None
[2020-10-21 13:29:09,355] {taskinstance.py:1143} INFO - Marking task as SUCCESS.dag_id=trynumber_demo, task_id=demo_task, execution_date=20201021T122907, start_date=20201021T122909, end_date=20201021T122909
[2020-10-21 13:29:09,404] {local_task_job.py:117} INFO - Task exited with return code 0
The TaskInstance() created for the run needs to first be refreshed from the database, before setting the logging context.
Metadata
Metadata
Assignees
Labels
kind:bugThis is a clearly a bugThis is a clearly a bugpriority:highHigh priority bug that should be patched quickly but does not require immediate new releaseHigh priority bug that should be patched quickly but does not require immediate new release