Constant Inode flips on LocalExecutor cause task failures (FileNotFoundError) #61396

@Deebol

Description

Apache Airflow version

3.1.6

If "Other Airflow 3 version" selected, which one?

3.1.5

What happened?

Hi.

In Airflow 3.1.6 and 3.1.5 (the issue does not occur on 3.1.3), GitDagBundle with LocalExecutor performs a full git clone into a new directory (new inode) and deletes the old one almost every time a task starts.

This "inode flipping" pulls the DAG folder out from under running tasks (e.g. dbt via Cosmos): paths they already resolved stop existing, resulting in FileNotFoundError or "directory not found" errors.
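
The failure mode can be shown outside Airflow. The following is a minimal sketch, not Airflow's actual code: it assumes the version directory is deleted and then replaced by a fresh checkout, which is what the logs further down suggest. The directory layout, the `abc123` version name, and the helpers `make_checkout` and `demo` are hypothetical.

```python
import os
import shutil
import tempfile


def make_checkout(path: str) -> str:
    """Create a fake checkout with one file; return that file's path."""
    os.makedirs(os.path.join(path, "dbt"))
    profile = os.path.join(path, "dbt", "profiles.yml")
    with open(profile, "w") as f:
        f.write("placeholder\n")
    return profile


def demo() -> tuple[int, int, bool]:
    root = tempfile.mkdtemp()
    try:
        version_dir = os.path.join(root, "versions", "abc123")
        profile = make_checkout(version_dir)
        inode_before = os.stat(version_dir).st_ino

        # Build the new checkout beside the old one, so both inodes exist at once.
        fresh = version_dir + ".new"
        make_checkout(fresh)

        shutil.rmtree(version_dir)  # the old directory is gone from here on
        try:
            # A task that resolved this path earlier now fails to open it.
            open(profile).close()
            failed_during_swap = False
        except FileNotFoundError:
            failed_during_swap = True

        os.rename(fresh, version_dir)  # same path again, different inode
        inode_after = os.stat(version_dir).st_ino
        return inode_before, inode_after, failed_during_swap
    finally:
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    before, after, failed = demo()
    print(f"inode before={before} after={after} read failed during swap={failed}")
```

The path is identical before and after the swap, but any process that touches it between the delete and the rename sees a missing directory, which matches the "does not exist" errors in the task logs.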

I ran the following script inside the Airflow scheduler pod to detect ongoing git operations:

python - <<'PY'
import os, time
from datetime import datetime, timezone

TARGET = "/tmp/airflow/dag_bundles/dh-pipeline-dags/versions/<provide sha>"
INTERVAL = 0.05

def now():
  return datetime.now(timezone.utc).strftime("%H:%M:%S.%f")[:-3]

seen_pids = set()

print(f"{now()} START monitoring GIT processes...")

while True:
  try:
    pids = [pid for pid in os.listdir('/proc') if pid.isdigit()]
    for pid in pids:
      if pid not in seen_pids:
        try:
          with open(f"/proc/{pid}/cmdline", "rb") as f:
            cmd = f.read().replace(b"\x00", b" ").decode("utf-8", "ignore").strip()

          if "git" in cmd.lower():
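            with open(f"/proc/{pid}/stat", "rb") as f:
              # comm (field 2) may contain spaces, so split after its closing ")"
              stat_parts = f.read().rsplit(b")", 1)[1].split()
              ppid = stat_parts[1].decode()  # fields after comm: state, ppid, ...

            inode_info = "N/A"
            try:
              inode_info = os.stat(TARGET).st_ino
            except OSError:
              pass  # TARGET is missing or is being replaced right now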

            print(f"{now()} GIT DETECTED! PID={pid} PPID={ppid} INODE_BASE={inode_info}")
            print(f"   CMD: {cmd[:150]}")

            seen_pids.add(pid)
        except (FileNotFoundError, ProcessLookupError):
          continue
  except Exception as e:
    print(f"Error: {e}")

  if len(seen_pids) > 1000:
    seen_pids.clear()

  time.sleep(INTERVAL)
PY

And this is what I got on 3.1.6:

...
13:33:06.182 GIT DETECTED! PID=16645 PPID=16640 INODE_BASE=139948383
   CMD: /usr/lib/git-core/git rev-list --objects --stdin --not --all --quiet --alternate-refs
13:33:06.337 GIT DETECTED! PID=16648 PPID=16637 INODE_BASE=406416615
   CMD: git clone -v -- /tmp/airflow/dag_bundles/dh-pipeline-dags/bare /tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994
13:33:06.337 GIT DETECTED! PID=16649 PPID=16648 INODE_BASE=406416615
   CMD: /bin/sh -c git-upload-pack '/tmp/airflow/dag_bundles/dh-pipeline-dags/bare' git-upload-pack '/tmp/airflow/dag_bundles/dh-pipeline-dags/bare'
13:33:06.337 GIT DETECTED! PID=16650 PPID=16649 INODE_BASE=406416615
   CMD: git-upload-pack /tmp/airflow/dag_bundles/dh-pipeline-dags/bare
13:33:06.491 GIT DETECTED! PID=16652 PPID=16637 INODE_BASE=406416615
   CMD: git cat-file --batch-check
13:33:06.491 GIT DETECTED! PID=16653 PPID=16637 INODE_BASE=406416615
   CMD: git reset --hard HEAD --
13:33:09.825 GIT DETECTED! PID=16658 PPID=16655 INODE_BASE=406416615
   CMD: git fetch -v -- origin +refs/heads/*:refs/heads/* +refs/tags/*:refs/tags/*
13:33:10.339 GIT DETECTED! PID=16666 PPID=16655 INODE_BASE=473240196
   CMD: git clone -v -- /tmp/airflow/dag_bundles/dh-pipeline-dags/bare /tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994
13:33:10.339 GIT DETECTED! PID=16667 PPID=16666 INODE_BASE=473240196
   CMD: /bin/sh -c git-upload-pack '/tmp/airflow/dag_bundles/dh-pipeline-dags/bare' git-upload-pack '/tmp/airflow/dag_bundles/dh-pipeline-dags/bare'
13:33:10.339 GIT DETECTED! PID=16668 PPID=16667 INODE_BASE=473240196
   CMD: git-upload-pack /tmp/airflow/dag_bundles/dh-pipeline-dags/bare
13:33:10.441 GIT DETECTED! PID=16669 PPID=16655 INODE_BASE=473240196
   CMD: git checkout qa/deployed
...

git clone runs repeatedly, and INODE_BASE changes each time (139948383, then 406416615, then 473240196, all within about four seconds).

On 3.1.3 I got:

...
13:16:45.591 GIT DETECTED! PID=354362 PPID=354358 INODE_BASE=14307405
   CMD: git checkout dev/deployed
13:16:46.580 GIT DETECTED! PID=354373 PPID=354366 INODE_BASE=14307405
   CMD: git cat-file --batch-check
13:16:49.630 GIT DETECTED! PID=354397 PPID=354393 INODE_BASE=14307405
   CMD: git checkout dev/deployed
13:16:49.990 GIT DETECTED! PID=354405 PPID=354400 INODE_BASE=14307405
   CMD: git cat-file --batch-check
13:16:49.990 GIT DETECTED! PID=354406 PPID=354400 INODE_BASE=14307405
   CMD: git reset --hard HEAD --
13:17:01.411 GIT DETECTED! PID=354439 PPID=354434 INODE_BASE=14307405
   CMD: git cat-file --batch-check
13:17:01.411 GIT DETECTED! PID=354440 PPID=354434 INODE_BASE=14307405
   CMD: git reset --hard HEAD --
13:17:04.472 GIT DETECTED! PID=354443 PPID=354441 INODE_BASE=14307405
   CMD: git version
...

No git clone calls occur, and INODE_BASE stays constant (14307405).
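
To compare longer captures from the two versions, the monitor output can be reduced to the points where INODE_BASE changes. This is a small sketch that assumes the exact line format printed by the script above; the function name `inode_transitions` is hypothetical.

```python
import re

# Matches the "GIT DETECTED!" lines printed by the monitoring script.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{3}) GIT DETECTED! "
    r"PID=(?P<pid>\d+) PPID=(?P<ppid>\d+) INODE_BASE=(?P<inode>\S+)$"
)


def inode_transitions(log_text: str) -> list[tuple[str, str, str]]:
    """Return (timestamp, old_inode, new_inode) for every change of INODE_BASE."""
    transitions = []
    previous = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # CMD lines, "..." markers, and blank lines are skipped
        inode = match["inode"]  # may also be "N/A" when TARGET was missing
        if previous is not None and inode != previous:
            transitions.append((match["ts"], previous, inode))
        previous = inode
    return transitions


if __name__ == "__main__":
    sample = "\n".join([
        "13:33:06.182 GIT DETECTED! PID=16645 PPID=16640 INODE_BASE=139948383",
        "13:33:06.337 GIT DETECTED! PID=16648 PPID=16637 INODE_BASE=406416615",
        "13:33:10.339 GIT DETECTED! PID=16666 PPID=16655 INODE_BASE=473240196",
    ])
    for ts, old, new in inode_transitions(sample):
        print(f"{ts}: {old} -> {new}")
```

Run against the 3.1.6 excerpt, this reports two transitions; run against the 3.1.3 excerpt, it reports none.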

Outcome: tasks fail frequently. Example logs from failed tasks:

[2026-02-03 14:50:02] ERROR - Failed to import: /tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994d812/dags/pipelines/transform_uk_flood_monitoring.py
CosmosLoadDbtException: Unable to run ['/home/airflow/.local/bin/dbt', 'ls', '--output', 'json', '--output-keys', 'name', 'unique_id', 'resource_type', 'depends_on', 'original_file_path', 'tags', 'config', 'freshness', '--project-dir', '/tmp/tmpv45sxgtr', '--profiles-dir', '/tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994d812/dbt', '--profile', 'xc_dh_extdata_model', '--target', 'xcloud', '--select', 'tag:live_flood_monitoring_readings,tag:raw'] due to the error:
stderr: Path '/tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994d812/dbt' does not exist.
[2026-02-03 15:09:24] ERROR - Failed to import: /tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994d812/dags/pipelines/transform_storm_overflow.py
CosmosLoadDbtException: Unable to run ['/home/airflow/.local/bin/dbt', 'ls', '--output', 'json', '--output-keys', 'name', 'unique_id', 'resource_type', 'depends_on', 'original_file_path', 'tags', 'config', 'freshness', '--project-dir', '/tmp/tmpm092liya', '--profiles-dir', '/tmp/airflow/dag_bundles/dh-pipeline-dags/versions/1108133dd6b2afb806d3531e9a97f8b15994d812/dbt', '--profile', 'xc_dh_extdata_model', '--target', 'xcloud', '--select', '+tag:storm_overflow,+tag:std'] due to the error:
stderr: [Errno 2] No such file or directory: '/tmp/tmpm092liya/dbt_project.yml'

What you think should happen instead?

No response

How to reproduce

  1. Set up Airflow 3.1.6 with LocalExecutor.
  2. Configure a GitDagBundle.
  3. Run multiple tasks that read files from the bundle directory (like a DbtSelectOperator from Cosmos).
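
For step 3, a dbt project is not strictly required; any task body that keeps reading from the bundle directory should hit the same window. Below is a rough sketch of such a probe in plain Python. The function name `probe_bundle_reads` and its parameters are hypothetical, and wiring it into a DAG (for example as a Python task callable) is left out on purpose.

```python
import os
import time


def probe_bundle_reads(bundle_dir: str, iterations: int = 100, pause: float = 0.05) -> list[str]:
    """Repeatedly read every file under bundle_dir and collect the errors seen."""
    errors = []
    for _ in range(iterations):
        try:
            # os.walk silently yields nothing for a missing directory,
            # so check for it explicitly first.
            if not os.path.isdir(bundle_dir):
                raise FileNotFoundError(bundle_dir)
            for dirpath, _dirnames, filenames in os.walk(bundle_dir):
                for name in filenames:
                    with open(os.path.join(dirpath, name), "rb") as f:
                        f.read()
        except OSError as exc:
            # Covers FileNotFoundError raised while the directory is being swapped.
            errors.append(f"{type(exc).__name__}: {exc}")
        time.sleep(pause)
    return errors
```

Several of these probes running in parallel while other tasks start should return a non-empty error list on an affected version and an empty one otherwise, assuming the behaviour described above.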

Operating System

Debian GNU/Linux 12 (bookworm)

Versions of Apache Airflow Providers

apache-airflow==3.1.6
apache-airflow-core==3.1.6
apache-airflow-providers-airbyte==5.3.1
apache-airflow-providers-amazon==9.19.0
apache-airflow-providers-common-compat==1.11.0
apache-airflow-providers-common-io==1.7.0
apache-airflow-providers-common-sql==1.30.2
apache-airflow-providers-datadog==3.10.1
apache-airflow-providers-http==5.6.2
apache-airflow-providers-microsoft-azure==12.10.1
apache-airflow-providers-mongo==5.3.1
apache-airflow-providers-smtp==2.4.1
apache-airflow-providers-standard==1.10.2
apache-airflow-task-sdk==1.1.6

Deployment

Official Apache Airflow Helm Chart

Deployment details

    - name: AIRFLOW__DAG_PROCESSOR__DEFAULT_BUNDLE_NAME
      value: "dh-pipeline-dags"
    - name: AIRFLOW__DAG_PROCESSOR__DAG_BUNDLE_CONFIG_LIST
      value: >
        [
          {
            "name": "dh-pipeline-dags",
            "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
            "kwargs": {
              "tracking_ref": "qa/deployed",
              "subdir": "dags",
              "git_conn_id": "git_xc_dh_pipeline",
              "refresh_interval": 300
            }
          }
        ]
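
Because the bundle list is passed as a JSON string inside a folded YAML scalar, it can be worth confirming what actually reaches the pod. The sketch below only parses the environment variable shown above and prints the fields relevant to this report; `summarize_bundles` is a hypothetical helper, not part of Airflow.

```python
import json
import os


def summarize_bundles(raw: str) -> list[dict]:
    """Parse a bundle config list and keep the fields relevant to this issue."""
    bundles = json.loads(raw)  # raises json.JSONDecodeError on malformed input
    if not isinstance(bundles, list):
        raise ValueError("expected a JSON list of bundle definitions")
    summary = []
    for bundle in bundles:
        kwargs = bundle.get("kwargs", {})
        summary.append({
            "name": bundle.get("name"),
            "classpath": bundle.get("classpath"),
            "tracking_ref": kwargs.get("tracking_ref"),
            "refresh_interval": kwargs.get("refresh_interval"),
        })
    return summary


if __name__ == "__main__":
    raw = os.environ.get("AIRFLOW__DAG_PROCESSOR__DAG_BUNDLE_CONFIG_LIST", "[]")
    for entry in summarize_bundles(raw):
        print(entry)
```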

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
