Skip to content

Conversation

@insomnes
Copy link
Contributor

@insomnes insomnes commented Feb 15, 2025

This PR intends to inform the user that they can resolve the backend container stuck in an unhealthy state by running the breeze down command.

  • Add unhealthy check for backend container with console notification message via docker inspect;
  • Do not run a check on not postgres / mysql backends or in CI;
  • Default to not show the notification in case of doubt or any error;
  • Use the check for _run_test and enter_shell routines;

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@insomnes
Copy link
Contributor Author

Background from Slack:

Yes breeze down is the way to go. The issue is that while we have “in-progress” work or when you switch branches - some of the migrations might or might not be valid, and your database might or might not be upgradeable/healthy. Breeze down cleans up the volumes used by the DB and generally make the DB reinitialize from scratch next time

It’s rather difficult to find out when you should clean-up the database (for example this might happen when you just rebase to latest version of airlfow or go back to previous version because your old PR was based on old version of airflow - but maybe someone could come with a proposal how to communicate it better in case this happens

@insomnes
Copy link
Contributor Author

insomnes commented Feb 15, 2025

There is a caveat for tests with --run-in-parallel: the notification would be printed separately for every parallel run.

@potiuk
Copy link
Member

potiuk commented Feb 15, 2025

There is a caveat for tests with --run-in-parallel: the notification would be printed separately for every parallel run.

That's not a problem at all. This is is very unlikely to happen as --run-in-parallel is mostly useful for CI and even if it is run locally, this is really a small issue to see the error multiple times.

@potiuk
Copy link
Member

potiuk commented Feb 15, 2025

Fantastic. Thanks @insomnes !

I tested it by modifying backend-mysql.yaml a bit and making health check fai:

Screenshot 2025-02-15 at 13 15 12

SELECT 1EEEEE fails, and decreasing the times for check make it fails fast :)

Result:

Screenshot 2025-02-15 at 13 14 10

@potiuk potiuk merged commit f93b465 into apache:main Feb 15, 2025
92 checks passed
dantonbertuol pushed a commit to dantonbertuol/airflow that referenced this pull request Feb 17, 2025
@insomnes insomnes deleted the notify-on-unhealthy-backend branch February 17, 2025 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants