@amoghrajesh

Previously, when GitHook creation failed (for example, because the git connection was not found), the exception was caught and suppressed, so bundle initialization appeared to proceed even though the hook had never been created. This led to a misleading error message later, "Connection {conn_id} doesn't have a host url", instead of the actual error.

I am making this change so that exceptions propagate immediately during bundle creation, with clear error messages. The DAG processor handles exceptions per bundle, so one failing bundle doesn't prevent the others from loading successfully.
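To illustrate the per-bundle handling described above, here is a minimal sketch. It is not the Airflow implementation: `ConnectionNotFound`, `SketchGitBundle`, `SketchLocalBundle`, `create_bundles`, and the `/path/to/dags` placeholder are hypothetical stand-ins, assumed only for this example. The constructor raises right away when the connection is missing, and the loop catches the error for that one bundle only.

```python
class ConnectionNotFound(Exception):
    """Hypothetical stand-in for the 'connection not defined' error."""


class SketchGitBundle:
    """Hypothetical bundle that needs a connection at construction time."""

    def __init__(self, name, git_conn_id, known_connections):
        self.name = name
        # Fail immediately with the real cause instead of suppressing it.
        if git_conn_id not in known_connections:
            raise ConnectionNotFound(f"The conn_id `{git_conn_id}` isn't defined")
        self.git_conn_id = git_conn_id


class SketchLocalBundle:
    """Hypothetical bundle that has no external dependency."""

    def __init__(self, name, path):
        self.name = name
        self.path = path


def create_bundles(factories):
    """Create each bundle independently and collect failures per bundle."""
    bundles, errors = {}, {}
    for name, factory in factories.items():
        try:
            bundles[name] = factory()
        except Exception as exc:
            # One failing bundle is recorded; the loop moves on to the next.
            errors[name] = str(exc)
    return bundles, errors


factories = {
    "test-git-bundle": lambda: SketchGitBundle(
        "test-git-bundle", "non_existent_connection", known_connections=set()
    ),
    "local-bundle": lambda: SketchLocalBundle("local-bundle", "/path/to/dags"),
}

bundles, errors = create_bundles(factories)
print(sorted(bundles))
print(errors)
```

In this sketch the error recorded for `test-git-bundle` is the original "isn't defined" message, and `local-bundle` is still created.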

For testing, use a config like this, where the git connection doesn't exist:

export AIRFLOW__DAG_PROCESSOR__DAG_BUNDLE_CONFIG_LIST='[
  {
    "name": "test-git-bundle",
    "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
    "kwargs": {
      "tracking_ref": "main",
      "git_conn_id": "non_existent_connection"
    }
  },
  {
    "name": "local-bundle",
    "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle",
    "kwargs": {
      "path": "/Users/amoghdesai/Documents/OSS/repos/airflow/files/test_dags",
      "refresh_interval": 0
    }
  }
]'
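Before exporting a value like the one above, it can help to check that the JSON parses and lists the bundles you expect. The following is a rough sketch using only the standard library; `summarize_bundles` is a hypothetical helper, the `/path/to/test_dags` path is a placeholder, and treating `name` and `classpath` as required keys is an assumption made for this example.

```python
import json

# Same shape as the config above; the local path is a placeholder.
raw_config = """[
  {
    "name": "test-git-bundle",
    "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
    "kwargs": {"tracking_ref": "main", "git_conn_id": "non_existent_connection"}
  },
  {
    "name": "local-bundle",
    "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle",
    "kwargs": {"path": "/path/to/test_dags", "refresh_interval": 0}
  }
]"""


def summarize_bundles(raw):
    """Parse the JSON list and return (name, class name, git_conn_id) tuples."""
    summary = []
    for entry in json.loads(raw):
        # Assumption for this sketch: every entry needs these two keys.
        missing = {"name", "classpath"} - entry.keys()
        if missing:
            raise ValueError(f"bundle entry is missing keys: {sorted(missing)}")
        class_name = entry["classpath"].rsplit(".", 1)[-1]
        conn_id = entry.get("kwargs", {}).get("git_conn_id")
        summary.append((entry["name"], class_name, conn_id))
    return summary


for row in summarize_bundles(raw_config):
    print(row)
```

This only checks the shape of the JSON; it does not verify that the connection exists, which is exactly the failure the config is meant to trigger.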

Before this change, the log looked like this:

2025-12-05T07:18:54.567663Z [error    ] Error initializing bundle test-git-bundle: Connection non_existent_connection doesn't have a host url [airflow.dag_processing.manager.DagFileProcessorManager] loc=manager.py:528
Traceback (most recent call last):
  File "/opt/airflow/airflow-core/src/airflow/dag_processing/manager.py", line 526, in _refresh_dag_bundles
    bundle.initialize()
  File "/opt/airflow/providers/git/src/airflow/providers/git/bundles/git.py", line 134, in initialize
    raise AirflowException(f"Connection {self.git_conn_id} doesn't have a host url")
airflow.sdk.exceptions.AirflowException: Connection non_existent_connection doesn't have a host url
2025-12-05T07:18:59.647017Z [error    ] Error initializing bundle test-git-bundle: Connection non_existent_connection doesn't have a host url [airflow.dag_processing.manager.DagFileProcessorManager] loc=manager.py:528
Traceback (most recent call last):
  File "/opt/airflow/airflow-core/src/airflow/dag_processing/manager.py", line 526, in _refresh_dag_bundles
    bundle.initialize()
  File "/opt/airflow/providers/git/src/airflow/providers/git/bundles/git.py", line 134, in initialize
    raise AirflowException(f"Connection {self.git_conn_id} doesn't have a host url")
airflow.sdk.exceptions.AirflowException: Connection non_existent_connection doesn't have a host url

With this change, it looks like this:

root@c435d9bb0884:/opt/airflow# airflow dag-processor
2025-12-05T08:45:11.600794Z [warning  ] Failed to convert value. Please check memray_trace_components key in profiling section. it must be one of scheduler, dag_processor, api, if not the value is ignored [airflow._shared.configuration.parser] loc=parser.py:1127
__file__='/files/plugins/triggera.py' loaded
__file__='/files/plugins/triggera_comprehensive.py' loaded
2025-12-05T08:45:12.101992Z [info     ] Starting the Dag Processor Job [airflow.jobs.dag_processor_job_runner.DagProcessorJobRunner] loc=dag_processor_job_runner.py:59
2025-12-05T08:45:12.102262Z [info     ] Processing files using up to 2 processes at a time  [airflow.dag_processing.manager.DagFileProcessorManager] loc=manager.py:266
2025-12-05T08:45:12.102305Z [info     ] Process each file at most once every 30 seconds [airflow.dag_processing.manager.DagFileProcessorManager] loc=manager.py:267
2025-12-05T08:45:12.168422Z [info     ] DAG bundles loaded: test-git-bundle, local-bundle [airflow.dag_processing.bundles.manager.DagBundlesManager] loc=manager.py:209
2025-12-05T08:45:12.174930Z [error    ] Could not create GitHook       [airflow.providers.git.bundles.git] bare_repo_path=PosixPath('/tmp/airflow/dag_bundles/test-git-bundle/bare') bundle_name=test-git-bundle conn_id=non_existent_connection exc=AirflowNotFoundException("The conn_id `non_existent_connection` isn't defined") git_conn_id=non_existent_connection loc=git.py:94 repo_path=PosixPath('/tmp/airflow/dag_bundles/test-git-bundle/tracking_repo') version=None versions_path=PosixPath('/tmp/airflow/dag_bundles/test-git-bundle/versions')
Traceback (most recent call last):
  File "/opt/airflow/providers/git/src/airflow/providers/git/bundles/git.py", line 92, in __init__
    self.hook = GitHook(git_conn_id=git_conn_id or "git_default", repo_url=self.repo_url)
  File "/opt/airflow/providers/git/src/airflow/providers/git/hooks/git.py", line 69, in __init__
    connection = self.get_connection(git_conn_id)
  File "/opt/airflow/task-sdk/src/airflow/sdk/bases/hook.py", line 61, in get_connection
    conn = Connection.get(conn_id)
  File "/opt/airflow/task-sdk/src/airflow/sdk/definitions/connection.py", line 224, in get
    return _get_connection(conn_id)
  File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/context.py", line 176, in _get_connection
    raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.sdk.exceptions.AirflowNotFoundException: The conn_id `non_existent_connection` isn't defined
2025-12-05T08:45:12.175495Z [error    ] Error creating bundle 'test-git-bundle': The conn_id `non_existent_connection` isn't defined [airflow.dag_processing.bundles.manager.DagBundlesManager] loc=manager.py:249
Traceback (most recent call last):
  File "/opt/airflow/airflow-core/src/airflow/dag_processing/bundles/manager.py", line 247, in sync_bundles_to_db
    new_template, new_params = _extract_and_sign_template(name)
  File "/opt/airflow/airflow-core/src/airflow/dag_processing/bundles/manager.py", line 216, in _extract_and_sign_template
    bundle_instance = self.get_bundle(name)
  File "/opt/airflow/airflow-core/src/airflow/dag_processing/bundles/manager.py", line 337, in get_bundle
    return cfg_bundle.bundle_class(name=name, version=version, **cfg_bundle.kwargs)
  File "/opt/airflow/providers/git/src/airflow/providers/git/bundles/git.py", line 92, in __init__
    self.hook = GitHook(git_conn_id=git_conn_id or "git_default", repo_url=self.repo_url)
  File "/opt/airflow/providers/git/src/airflow/providers/git/hooks/git.py", line 69, in __init__
    connection = self.get_connection(git_conn_id)
  File "/opt/airflow/task-sdk/src/airflow/sdk/bases/hook.py", line 61, in get_connection
    conn = Connection.get(conn_id)
  File "/opt/airflow/task-sdk/src/airflow/sdk/definitions/connection.py", line 224, in get
    return _get_connection(conn_id)
  File "/opt/airflow/task-sdk/src/airflow/sdk/execution_time/context.py", line 176, in _get_connection
    raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.sdk.exceptions.AirflowNotFoundException: The conn_id `non_existent_connection` isn't defined
2025-12-05T08:45:12.175815Z [warning  ] Removing ownership of team 'None' from Dag bundle 'local-bundle' [airflow.dag_processing.bundles.manager.DagBundlesManager] loc=manager.py:28


@amoghrajesh amoghrajesh force-pushed the dagmanager-error-handling branch from ac325a8 to 14559b6 Compare December 8, 2025 07:42
@amoghrajesh

Sorry for the spam folks, possibly due to a bad rebase I had!

@amoghrajesh amoghrajesh merged commit 85b65eb into apache:main Dec 8, 2025
121 checks passed
@amoghrajesh amoghrajesh deleted the dagmanager-error-handling branch December 8, 2025 15:05
@amoghrajesh

Ahhh missed the v3 label!

@amoghrajesh

Oh this one will need manual port over.

@amoghrajesh

Manual port to v3 branch here: #59236
