dag processor deletes import errors of other dag processors thinking the files don't exist #35949

@tirkarthi

Apache Airflow version

main (development)

What happened

When a dag processor is started with a subdirectory to process, its import errors are recorded with file paths under that subdirectory. To remove stale import errors, the processor for the airflow-dag-processor-0 folder lists all files under that folder and deletes every import error whose file is not in the list. This becomes a problem when a second processor handles airflow-dag-processor-1: the files behind its import errors are not part of the airflow-dag-processor-0 folder, so the first processor treats them as missing and deletes those errors.
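The behaviour described above can be modelled in a few lines of plain Python. This is a simplified sketch, not Airflow's actual code: the `import_errors` dict stands in for the import error table, and `cleanup_import_errors` is a hypothetical name for the cleanup step.

```python
# Hypothetical stand-in for the import error table: filename -> stacktrace.
import_errors = {
    "/dags/airflow-dag-processor-0/sample_sleep.py": "ImportError: cannot import name 'invalid'",
    "/dags/airflow-dag-processor-1/sample_sleep.py": "ImportError: cannot import name 'invalid'",
}


def cleanup_import_errors(errors, files_seen):
    """Unscoped cleanup: keep only errors whose file this processor listed."""
    return {name: trace for name, trace in errors.items() if name in files_seen}


# The processor for airflow-dag-processor-0 only lists files in its own folder.
files_seen_by_processor_0 = {"/dags/airflow-dag-processor-0/sample_sleep.py"}

remaining = cleanup_import_errors(import_errors, files_seen_by_processor_0)

# The error recorded by the airflow-dag-processor-1 processor is gone,
# even though its file still exists on disk.
print(sorted(remaining))
```

Because the cleanup compares every recorded error against a single processor's file listing, whichever processor runs its cleanup last wipes the errors of all the others.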

What you think should happen instead

The fix would be to store processor_subdir in the ImportError table so that each dag processor queries only the import errors relevant to it and does not delete those of other processors. A fix similar to #33357 needs to be applied for import errors as well.
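A minimal sketch of the proposed scoping, using an in-memory SQLite table in place of Airflow's metadata database. The table layout, the `cleanup_import_errors` helper, and the `removed.py` file are assumptions made for illustration; only the `processor_subdir` column name comes from the proposal above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema: the real ImportError model has more columns.
conn.execute(
    "CREATE TABLE import_error (filename TEXT, stacktrace TEXT, processor_subdir TEXT)"
)
conn.executemany(
    "INSERT INTO import_error VALUES (?, ?, ?)",
    [
        ("/dags/airflow-dag-processor-0/sample_sleep.py", "ImportError", "/dags/airflow-dag-processor-0"),
        # Hypothetical stale entry: this file no longer exists on disk.
        ("/dags/airflow-dag-processor-0/removed.py", "ImportError", "/dags/airflow-dag-processor-0"),
        ("/dags/airflow-dag-processor-1/sample_sleep.py", "ImportError", "/dags/airflow-dag-processor-1"),
    ],
)


def cleanup_import_errors(conn, processor_subdir, files_seen):
    """Delete stale errors, but only those recorded for this processor's subdir."""
    query = "DELETE FROM import_error WHERE processor_subdir = ?"
    params = [processor_subdir]
    if files_seen:
        placeholders = ",".join("?" for _ in files_seen)
        query += f" AND filename NOT IN ({placeholders})"
        params.extend(files_seen)
    conn.execute(query, params)


cleanup_import_errors(
    conn,
    "/dags/airflow-dag-processor-0",
    ["/dags/airflow-dag-processor-0/sample_sleep.py"],
)

remaining = [
    row[0] for row in conn.execute("SELECT filename FROM import_error ORDER BY filename")
]
print(remaining)
```

With the extra filter, the processor for airflow-dag-processor-0 still removes its own stale entry but leaves the error recorded for airflow-dag-processor-1 untouched.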

How to reproduce

  1. Create a dag file with an import error at ~/airflow/dags/airflow-dag-processor-0/sample_sleep.py. Start a dag processor with -S to process "~/airflow/dags/airflow-dag-processor-0/". The import error should be present.
  2. Create a dag file with an import error at ~/airflow/dags/airflow-dag-processor-1/sample_sleep.py. Start a dag processor with -S to process "~/airflow/dags/airflow-dag-processor-1/". The import error for airflow-dag-processor-0 is deleted.

sample_sleep.py:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.decorators import task

# Intentional import error: `invalid` does not exist in the datetime module.
from datetime import timedelta, invalid


with DAG(
    dag_id="task_duration",
    start_date=datetime(2023, 1, 1),
    catchup=True,
    schedule_interval="@daily",
) as dag:

    @task
    def sleeper():
        pass

    sleeper()

Operating System

Ubuntu

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels: area:core, kind:bug, needs-triage