Description
Apache Airflow version
main (development)
What happened
When a DAG processor is started with a subdirectory to process, import errors are recorded with file paths under that subdirectory. To remove stale import errors, the processor for the airflow-dag-processor-0 folder lists all files under that folder and deletes every import error whose file is not in that list. This becomes a problem when a second processor for airflow-dag-processor-1 records import errors, because its files are never part of the airflow-dag-processor-0 folder, so its import errors get deleted.
What you think should happen instead
The fix would be to store processor_subdir in the ImportError table so that, when querying, each DAG processor only looks at the import errors relevant to it and does not delete the others. A fix similar to #33357 needs to be applied to import errors as well.
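To make the difference concrete, here is a minimal sketch of the unscoped cleanup versus a cleanup scoped by processor_subdir. It uses an in-memory SQLite table as a simplified stand-in for the ImportError table; the table layout and the helper names (record_error, clear_stale_unscoped, clear_stale_scoped) are hypothetical and are not Airflow's actual code.

```python
import sqlite3

# Paths taken from the reproduction steps in this issue.
SUBDIR_0 = "~/airflow/dags/airflow-dag-processor-0"
SUBDIR_1 = "~/airflow/dags/airflow-dag-processor-1"
FILE_0 = SUBDIR_0 + "/sample_sleep.py"
FILE_1 = SUBDIR_1 + "/sample_sleep.py"


def make_db():
    # Simplified, hypothetical stand-in for the ImportError table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE import_error (filename TEXT, stacktrace TEXT, processor_subdir TEXT)"
    )
    return conn


def record_error(conn, filename, processor_subdir):
    conn.execute(
        "INSERT INTO import_error VALUES (?, ?, ?)",
        (filename, "ImportError: ...", processor_subdir),
    )


def clear_stale_unscoped(conn, files_seen):
    # Behaviour described in this issue: delete every import error whose
    # file is not in this processor's listing, regardless of who recorded it.
    placeholders = ",".join("?" * len(files_seen))
    conn.execute(
        f"DELETE FROM import_error WHERE filename NOT IN ({placeholders})",
        list(files_seen),
    )


def clear_stale_scoped(conn, files_seen, processor_subdir):
    # Proposed behaviour: only touch rows recorded for this processor's subdir.
    placeholders = ",".join("?" * len(files_seen))
    conn.execute(
        "DELETE FROM import_error "
        f"WHERE processor_subdir = ? AND filename NOT IN ({placeholders})",
        [processor_subdir, *files_seen],
    )


def run(scoped):
    conn = make_db()
    record_error(conn, FILE_0, SUBDIR_0)
    record_error(conn, FILE_1, SUBDIR_1)
    # The processor for airflow-dag-processor-1 only sees its own files.
    if scoped:
        clear_stale_scoped(conn, [FILE_1], SUBDIR_1)
    else:
        clear_stale_unscoped(conn, [FILE_1])
    return sorted(row[0] for row in conn.execute("SELECT filename FROM import_error"))


unscoped_result = run(scoped=False)  # the error for processor-0 is lost
scoped_result = run(scoped=True)     # both errors survive
print(unscoped_result)
print(scoped_result)
```

In the unscoped run only the airflow-dag-processor-1 row remains, which matches the behaviour reported here; with the extra processor_subdir filter, the row recorded by the other processor is left alone.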
How to reproduce
- Create a dag file with an import error at ~/airflow/dags/airflow-dag-processor-0/sample_sleep.py. Start a dag processor with -S to process "~/airflow/dags/airflow-dag-processor-0/". The import error should be present.
- Create a dag file with an import error at ~/airflow/dags/airflow-dag-processor-1/sample_sleep.py. Start a dag processor with -S to process "~/airflow/dags/airflow-dag-processor-1/". The import error for airflow-dag-processor-0 is deleted.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.decorators import task
# Deliberate import error: `invalid` does not exist in the datetime module.
from datetime import timedelta, invalid

with DAG(
    dag_id="task_duration",
    start_date=datetime(2023, 1, 1),
    catchup=True,
    schedule_interval="@daily",
) as dag:

    @task
    def sleeper():
        pass

    sleeper()
Operating System
Ubuntu
Versions of Apache Airflow Providers
No response
Deployment
Virtualenv installation
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct