Scheduler heartbeat warning in Airflow UI sometimes incorrectly reports that the scheduler is down #28118

@kosteev

Apache Airflow version

main (development)

What happened

Steps to reproduce:

  1. Run two replicas of the scheduler.
  2. Initiate shutdown of one of the schedulers.
  3. In the Airflow UI, observe the warning message.

image

Step 3 should be done immediately after step 2 (refresh the UI page a few times). Steps 2 and 3 may need to be repeated a couple of times to reproduce the issue.

What you think should happen instead

The warning message shouldn't be displayed.

The issue is that this warning is based on the most recent scheduler job (the one with the latest heartbeat), which is fetched with:

return session.query(cls).order_by(cls.latest_heartbeat.desc()).limit(1).first()

This may return a job that is no longer running (state != "running"), which is why the warning message appears.
In this case the warning is misleading, because another scheduler replica is still running in parallel.
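
To illustrate the behavior described above, here is a minimal, self-contained sketch in plain Python. It is not Airflow's code: the `SchedulerJob` dataclass, the helper names, the `"success"` state used for the stopped replica, and the 30-second threshold are all assumptions made for illustration. It contrasts picking the job with the latest heartbeat regardless of state (as the quoted query does) with one possible alternative that only considers jobs still in the running state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SchedulerJob:
    # Hypothetical stand-in for a scheduler job row; not Airflow's model.
    hostname: str
    state: str
    latest_heartbeat: datetime


def most_recent_job(jobs: List[SchedulerJob]) -> Optional[SchedulerJob]:
    """Mirror the quoted query: latest heartbeat, regardless of state."""
    return max(jobs, key=lambda j: j.latest_heartbeat, default=None)


def most_recent_running_job(jobs: List[SchedulerJob]) -> Optional[SchedulerJob]:
    """Possible alternative: latest heartbeat among running jobs only."""
    running = [j for j in jobs if j.state == "running"]
    return max(running, key=lambda j: j.latest_heartbeat, default=None)


def is_alive(
    job: Optional[SchedulerJob],
    now: datetime,
    threshold: timedelta = timedelta(seconds=30),  # assumed value
) -> bool:
    """A job counts as alive if it is running and its heartbeat is recent."""
    return (
        job is not None
        and job.state == "running"
        and now - job.latest_heartbeat < threshold
    )


now = datetime(2022, 12, 5, 12, 0, 0)
jobs = [
    # Replica A is healthy; its last heartbeat was 5 seconds ago.
    SchedulerJob("scheduler-a", "running", now - timedelta(seconds=5)),
    # Replica B was just shut down: it wrote a final heartbeat 1 second ago
    # and moved to a non-running state ("success" is illustrative).
    SchedulerJob("scheduler-b", "success", now - timedelta(seconds=1)),
]

# Selecting by heartbeat alone picks the stopped replica, so the UI warns.
show_warning_current = not is_alive(most_recent_job(jobs), now)

# Restricting to running jobs picks the healthy replica, so no warning.
show_warning_filtered = not is_alive(most_recent_running_job(jobs), now)
```

Under these assumptions, the first selection yields the stopped replica and triggers the warning, while the filtered selection finds the healthy replica. Whether filtering by state is the right fix for Airflow itself is left open here.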

How to reproduce

No response

Operating System

Linux

Versions of Apache Airflow Providers

No response

Deployment

Composer

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
