
DbtCloudRunJobTrigger: misleading timeout error message and missing final status check #61979

@eran-moses-human

Description

Apache Airflow Provider(s)

dbt-cloud

Versions of Apache Airflow Providers

apache-airflow-providers-dbt-cloud>=4.6.4rc1

Apache Airflow version

main (also affects Airflow 2.x with deferrable mode)

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Astronomer

What happened

DbtCloudRunJobTrigger.run() has two bugs that surface when a dbt Cloud job run exceeds the configured timeout:

Bug 1: Error message prints an epoch timestamp instead of elapsed duration

When the timeout fires, the error message reads:

Job run 70403194297680 has not reached a terminal status after 1771200015.8154984 seconds.

1771200015.8 seconds ≈ 56 years, which is obviously wrong. The value 1771200015.8 is actually the Unix epoch timestamp (2026-02-16 00:00:15 UTC), not a duration.

This happens because self.end_time is an absolute timestamp (time.time() + timeout), but the error message formats it as if it were a relative duration:

f"Job run {self.run_id} has not reached a terminal status after "
f"{self.end_time} seconds."
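To make the mix-up concrete, here is a minimal standalone sketch (illustrative variables, not the provider's actual code) contrasting the buggy formatting with a corrected one:

```python
import time

timeout = 60                # relative duration passed to the trigger, in seconds
start = time.time()
end_time = start + timeout  # absolute epoch deadline, e.g. ~1.77e9

# Buggy formatting: interpolates the absolute deadline as if it were a duration.
buggy_msg = f"Job run has not reached a terminal status after {end_time} seconds."

# One possible fix: report the measured elapsed time (or simply `timeout`).
elapsed = time.time() - start
fixed_msg = f"Job run has not reached a terminal status after {elapsed:.1f} seconds."
```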

Bug 2: Timeout check fires without a final status poll

The run() loop structure is:

while await self.is_still_running(hook):   # status check → still running
    if self.end_time < time.time():        # timeout expired → yield error immediately
        ...
    await asyncio.sleep(self.poll_interval)

If the job completes during asyncio.sleep(self.poll_interval) and the timeout also expires during that sleep, the trigger yields a timeout error without re-checking the job status. This can cause the Airflow task to fail even though the dbt Cloud job completed successfully.
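The race can be demonstrated with a toy, self-contained model (a simplified loop, not the provider's code: here the deadline check runs right after the sleep, and `final_check` toggles the missing re-poll):

```python
import asyncio
import time

async def trigger_loop(job_done_at: float, deadline: float,
                       poll_interval: float, final_check: bool) -> str:
    """Simplified model of the trigger's polling loop."""
    def is_still_running() -> bool:
        # Stand-in for the async dbt Cloud API status poll.
        return time.monotonic() < job_done_at

    while is_still_running():
        await asyncio.sleep(poll_interval)
        if time.monotonic() > deadline:
            if not final_check:
                return "timeout error"  # buggy: no fresh status poll
            if is_still_running():
                return "timeout error"  # fixed: still running, genuine timeout
            break                       # fixed: job finished during the sleep
    return "success"

async def run_once(final_check: bool) -> str:
    now = time.monotonic()
    # The job finishes AND the deadline expires during one 0.3 s sleep.
    return await trigger_loop(job_done_at=now + 0.10, deadline=now + 0.05,
                              poll_interval=0.3, final_check=final_check)

print(asyncio.run(run_once(final_check=False)))  # timeout error
print(asyncio.run(run_once(final_check=True)))   # success
```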

What you think should happen instead

  1. The error message should display the actual elapsed duration (or the configured timeout value), not the raw epoch timestamp.
  2. Before yielding a timeout error, the trigger should perform one final is_still_running() check so that jobs completing near the timeout boundary are not incorrectly reported as timed out.

How to reproduce

  1. Configure a DbtCloudRunJobOperator with deferrable=True and timeout=60 (1 minute).
  2. Trigger a dbt Cloud job that takes longer than 60 seconds but completes successfully.
  3. Observe:
    • The Airflow task fails with a DbtCloudJobRunException.
    • The error message reports a nonsensical number of seconds (a Unix epoch timestamp).
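A minimal DAG for the reproduction above (hypothetical `job_id` and connection id; any dbt Cloud job that runs longer than a minute will do):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator

with DAG(
    dag_id="dbt_cloud_timeout_repro",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    DbtCloudRunJobOperator(
        task_id="run_dbt_job",
        dbt_cloud_conn_id="dbt_cloud_default",  # assumed connection id
        job_id=12345,                           # a job that takes > 60 seconds
        deferrable=True,                        # defers to DbtCloudRunJobTrigger
        timeout=60,                             # 1-minute timeout from step 1
        check_interval=10,
    )
```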

Suggested fix

In providers/dbt/cloud/src/airflow/providers/dbt/cloud/triggers/dbt.py, the run() method should:

async def run(self) -> AsyncIterator[TriggerEvent]:
    hook = DbtCloudHook(self.conn_id, **self.hook_params)
    try:
        while await self.is_still_running(hook):
            await asyncio.sleep(self.poll_interval)
            if self.end_time < time.time():
                # Final status check before declaring timeout
                if await self.is_still_running(hook):
                    yield TriggerEvent(
                        {
                            "status": "error",
                            "message": f"Job run {self.run_id} has not reached a terminal status "
                            "within the configured timeout.",
                            "run_id": self.run_id,
                        }
                    )
                    return
                break  # Job completed — fall through to status handling
        ...

Anything else

Related issue: #61297 (another bug in the same provider where wait_for_job_run_status silently succeeds on terminal failure).

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
