Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statefulset OnDelete Limitation and Enhancement #744

Open
mohammadkhavari opened this issue Sep 3, 2023 · 4 comments
Open

Statefulset OnDelete Limitation and Enhancement #744

mohammadkhavari opened this issue Sep 3, 2023 · 4 comments
Assignees
Labels
question Further information is requested

Comments

@mohammadkhavari
Copy link
Contributor

While using vmcluster, rolling and updating vmstorage and vmselect we can't actually use OnDelete policy, cause when updateStrategy is OnDelete, victoriametrics operator will delete sts pods accordingly, this feature was added due to this issue. #344
the problem here is when we apply a minor change in vmstorage configurations operator will delete vmstorage pods in order and vminserts will start rerouting, all storage nodes will face pending indexdb items as soon as new timeseries arrive, and this leads to metrics loss in the pipeline, our vmcluster can tolerate one storage node downtime without metric loss, but multiple storage node downtime is troublesome.
Is there any way that we can actually use OnDelete Strategy on vmstorage statefulsets, or even a parameter like stsPodsRollingDelay can be save us from downtime, a time that operator waits between recreating each vmstorage pod.
In fact we can turn off the operator and update statefulset and vmcluster objects and delete pods accordingly, but if we make a mistake in consistency between our manually edited sts and the sts that vmcluster will generate (that cause different revisions), by activating operator pods may recreated again.

@Haleygo Haleygo added the question Further information is requested label Sep 11, 2023
@Haleygo Haleygo self-assigned this Sep 11, 2023
@Haleygo
Copy link
Contributor

Haleygo commented Sep 13, 2023

Hello! That's a very good case to consider!

Is there any way that we can actually use OnDelete Strategy on vmstorage statefulsets

From the commit, operator will take over the rolling update process if rollingUpdateStrategy=OnDelete, that should cover some corner cases, so I don't think it's recommended to use the real OnDelete Strategy.

or even a parameter like stsPodsRollingDelay can be save us from downtime, a time that operator waits between recreating each vmstorage pod.

Since both OnDelete and RollingUpdate wait pod to be ready to delete the next one, I'm thinking you can increase successThreshold or periodSeconds under vmstorage's readinessProbe, that should prolong the pod ready duration and let vmstorage cluster to be more stable. For example

   readinessProbe:
      failureThreshold: 10
      httpGet:
        path: /health
        port: 8482
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 6 // under this setting, vmstorage pod will wait at least 60s to become ready, but it will start working before that 
      timeoutSeconds: 5

@f41gh7
Copy link
Collaborator

f41gh7 commented Sep 26, 2023

I think, it's a global vmcluster issue. Changing rolling update strategy or increasing delays between component updates don't help much in this case.

VictoriaMetrics/VictoriaMetrics#4922

@Haleygo
Copy link
Contributor

Haleygo commented Nov 22, 2023

Hey @mohammadkhavari,
Starting from v1.95.0, you can set -storage.vminsertConnsShutdownDuration command-line flag at vmstorage to gradual close vminsert connections during vmstorage graceful shutdown, see these docs for more details.

@dctrwatson
Copy link

dctrwatson commented Feb 16, 2024

storage.vminsertConnsShutdownDuration has no effect (when longer than 25s) when using OnDelete strategy because the operator has a hard-coded 30s timeout:

err := rclient.Delete(ctx, &pod, &client.DeleteOptions{GracePeriodSeconds: pointer.Int64Ptr(30)})

Instead, the deletion grace period should probably use TerminationGracePeriodSeconds instead. Since a user will need to properly set that too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants