-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Fix n+1 queries in ui/dags endpoint and use a compact response model. #57425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…t trigger n+1 queries to other tables.
pierrejeambrun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, we were probably missing some loading options in the queries, and if we don't need them.
Just one question regarding backward compatibility, otherwise LGTM.
|
What I plan to do, and we could do here is add some |
bbovenzi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catches!
…apache#57425) * Use DAGRunLightResponse to stop loading unused fields in response that trigger n+1 queries to other tables. * Update frontend type and tests. * Fix MyPy error.
The recent dag runs query was triggering access to task_instances, task_instance_histories, deadlines and dag_run_note tables as part of
model_validateusingDagRunResponseand each attribute access caused a query resulting n+1 queries for each dagrun in the recent runs response. The recent runs response included a lot of fields that are unused in the UI. This PR simplifies the response to selectively query the fields and usesDAGRunLightResponseto includedurationfield which fulfills all the fields required in the frontend.Additionally, the
pending_actionsfilter to filter by required actions had condition ashas_pending_actions.value is not Falseand in case of normal dags list page the has_pending_actions.value is None and thusNone is not Falsealso lead to evaluating query where it was not required. This also improved the performance.With the removal of n+1 queries and unused fields the overall response was significantly improved up to 7-8x where page that took 1s second to load for 15 dags and 2100 dagruns now loaded in around 100ms. This is also more noticeable in large environments where the n+1 queries are not executed. The http response size was also reduced with removal of unused fields in the response.
pydantic/pydantic#8192
sqlalchemy/sqlalchemy#10120
https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html#sqlalchemy.orm.noload
Closes #57418