use internal exponential backoff to avoid flapping on DB startup #8797

Open
krancour opened this issue May 18, 2022 · 1 comment
Labels
area/workflow-archive solution/suggested A solution to the bug has been suggested. Someone needs to implement it. type/feature Feature request

Comments

@krancour

Walking through the quickstart, I noticed that both the argo-server and the workflow controller "flap" while waiting for postgres to become available. On average (on my machine, at least), both components restart 3 times before coming up clean. This is by no means out of the ordinary for k8s apps. However, if either of those components gets too far into a crashloop backoff, the system as a whole can take longer than it should to come up clean.

I wanted to propose implementing an exponential backoff (with a low max backoff between retries) internally, which is fairly easy to do, so that components don't "flap" like this while waiting for their own network-bound dependencies to be satisfied. Speaking from experience, this strategy can allow a system such as this one to start faster and more smoothly as a whole.


Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.

@krancour krancour added the type/feature Feature request label May 18, 2022
@agilgur5 agilgur5 changed the title Proposal: use internal exponential backoff to avoid flapping on startup Proposal: use internal exponential backoff to avoid flapping on DB startup Oct 16, 2024
@agilgur5 agilgur5 changed the title Proposal: use internal exponential backoff to avoid flapping on DB startup use internal exponential backoff to avoid flapping on DB startup Oct 16, 2024
@agilgur5 agilgur5 added area/controller Controller issues, panics area/server labels Oct 18, 2024
@agilgur5
Member

agilgur5 commented Oct 18, 2024

I think this can be done within CreateDBSession?

session, err := sqldb.CreateDBSession(wfc.kubeclientset, wfc.namespace, persistence)

This could use the existing Backoff function and potentially the existing transient error detection, although the errors may be fairly different for DB connections and may be DB-dependent as well.

@agilgur5 agilgur5 added solution/suggested A solution to the bug has been suggested. Someone needs to implement it. area/workflow-archive and removed area/controller Controller issues, panics area/server labels Oct 18, 2024