Passing last_dag_run_state to recent dag runs endpoint returns dags where the dag has any dagrun equal to the last_dag_run_state #43882

Open
1 of 2 tasks
tirkarthi opened this issue Nov 11, 2024 · 0 comments
Labels
area:API Airflow's REST/HTTP API area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet

Apache Airflow version

main (development)

If "Other Airflow 2 version" selected, which one?

No response

What happened?

Passing last_dag_run_state from the UI causes the recent dag runs API to return every dag that has any dag run in the given state, instead of filtering on the state of the last dag run. For example, a dag with one failure among 10 dag runs, whose last dag run succeeded, is still returned when last_dag_run_state=failed is passed to the API.
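To make the difference between the two behaviours concrete, here is a minimal sketch in plain Python. It does not use Airflow at all: the `runs` list and both helper functions are hypothetical stand-ins, assumed only for illustration.

```python
from datetime import datetime

# Hypothetical in-memory dag runs: (dag_id, execution_date, state).
# dag_a: 10 runs, only the 3rd failed, so its last run is a success.
# dag_b: 5 runs, only the last one failed.
runs = [
    ("dag_a", datetime(2024, 1, day), "failed" if day == 3 else "success")
    for day in range(1, 11)
] + [
    ("dag_b", datetime(2024, 1, day), "failed" if day == 5 else "success")
    for day in range(1, 6)
]


def dags_with_any_run_in_state(runs, state):
    """Behaviour reported in this issue: match if ANY run has the state."""
    return sorted({dag_id for dag_id, _, run_state in runs if run_state == state})


def dags_with_last_run_in_state(runs, state):
    """Expected behaviour: match only on the state of the LATEST run."""
    latest = {}
    for dag_id, execution_date, run_state in runs:
        # Keep the run with the greatest execution_date per dag.
        if dag_id not in latest or execution_date > latest[dag_id][0]:
            latest[dag_id] = (execution_date, run_state)
    return sorted(
        dag_id for dag_id, (_, run_state) in latest.items() if run_state == state
    )


print(dags_with_any_run_in_state(runs, "failed"))
print(dags_with_last_run_in_state(runs, "failed"))
```

With this data, the first function includes dag_a because of its single old failure, while the second only keeps dag_b.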

This is likely due to the query below. It calculates a rank, and when last_dag_run_state is passed, only the row with rank 1 (the latest dag run) should be considered by the filter. Instead, _LastDagRunStateFilter only considers state. I am not sure how the rank from the subquery can be accessed there.

recent_runs_subquery = (
    select(
        DagRun.dag_id,
        DagRun.execution_date,
        func.rank()
        .over(
            partition_by=DagRun.dag_id,
            order_by=DagRun.execution_date.desc(),
        )
        .label("rank"),
    )
    .order_by(DagRun.execution_date.desc())
    .subquery()
)

class _LastDagRunStateFilter(BaseParam[DagRunState]):
    """Filter on the state of the latest DagRun."""

    def to_orm(self, select: Select) -> Select:
        if self.value is None and self.skip_none:
            return select
        return select.where(DagRun.state == self.value)

    def depends(self, last_dag_run_state: DagRunState | None = None) -> _LastDagRunStateFilter:
        return self.set_value(last_dag_run_state)

One idea was to filter by rank = 1 and state when last_dag_run_state is passed, but then only the last run would be returned for each dag instead of the n most recent dag runs.

.join(DagModel, DagModel.dag_id == recent_runs_subquery.c.dag_id)

The get dags endpoint works, though, since it uses a CTE to return only the correct dags:

https://github.com/apache/airflow/blob/main/airflow/api_fastapi/common/db/dags.py
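A similar CTE idea might also work for the recent dag runs query: first pick the dags whose rank 1 run has the requested state, then return the n most recent runs for just those dags. The sketch below shows this with the standard library `sqlite3` module and raw SQL. It is not the Airflow implementation: the `dag_run` table, its columns, the `recent_runs` helper and the parameter names are all assumptions made for illustration, and it needs an SQLite build with window function support (3.25 or newer).

```python
import sqlite3

# Hypothetical, simplified stand-in for the dag_run table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)")

rows = []
for day in range(1, 6):
    date = f"2024-01-{day:02d}"
    # dag_1: last run succeeded, but the 3rd run failed.
    rows.append(("dag_1", date, "failed" if day == 3 else "success"))
    # dag_2: only the last run failed.
    rows.append(("dag_2", date, "failed" if day == 5 else "success"))
conn.executemany("INSERT INTO dag_run VALUES (?, ?, ?)", rows)

RECENT_RUNS_SQL = """
WITH ranked AS (
    SELECT dag_id, execution_date, state,
           RANK() OVER (
               PARTITION BY dag_id ORDER BY execution_date DESC
           ) AS run_rank
    FROM dag_run
),
matching_dags AS (
    -- Only the latest run (rank 1) decides whether a dag matches.
    SELECT dag_id FROM ranked
    WHERE run_rank = 1 AND state = :wanted_state
)
SELECT r.dag_id, r.execution_date, r.state
FROM ranked AS r
JOIN matching_dags AS m ON m.dag_id = r.dag_id
WHERE r.run_rank <= :max_runs  -- still return the n most recent runs
ORDER BY r.dag_id, r.execution_date DESC
"""


def recent_runs(conn, wanted_state, max_runs):
    """Return up to max_runs recent runs for dags whose last run matches."""
    params = {"wanted_state": wanted_state, "max_runs": max_runs}
    return conn.execute(RECENT_RUNS_SQL, params).fetchall()


for row in recent_runs(conn, "failed", 3):
    print(row)
```

Here the state filter only decides which dags are kept, so the older failure in dag_1 no longer matters, and the rank limit still returns several runs per dag rather than just the last one.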

This can be reproduced with the test parameters below. In the fixture, dag_1 and dag_2 have success as the last dag run state and dag_3 has failed, but the returned dags remain the same whether "success" or "failed" is passed.

({"last_dag_run_state": "success", "only_active": False}, [DAG1_ID, DAG2_ID, DAG3_ID], 6),
({"last_dag_run_state": "failed", "only_active": False}, [DAG1_ID, DAG2_ID, DAG3_ID], 9),

What you think should happen instead?

No response

How to reproduce

  1. Dag 1 with 5 dag runs, where the last run succeeded but the 3rd run failed.
  2. Dag 2 with 5 dag runs, where the last run failed.
  3. Go to http://localhost:8000/webapp/dags?last_dag_run_state=failed. The recent dag runs API still returns Dag 1, although it is filtered out in the frontend.

Operating System

Ubuntu

Versions of Apache Airflow Providers

No response

Deployment

Other

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
