Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow to specify grace period for pod GC #5033

Merged
merged 3 commits into from
Feb 9, 2021
Merged

feat: Allow to specify grace period for pod GC #5033

merged 3 commits into from
Feb 9, 2021

Conversation

terrytangyuan
Copy link
Member

@terrytangyuan terrytangyuan commented Feb 5, 2021

Use case: we need some grace period to allow other services to complete the pod information collection (e.g. log and db persistence), especially during high load where those services have certain amount of delays.

Checklist:

@terrytangyuan
Copy link
Member Author

cc @stefansedich who's looking for this feature as well.

@stefansedich
Copy link
Contributor

stefansedich commented Feb 5, 2021

cc @stefansedich who's looking for this feature as well.

Love your work! Question however will this do what we want? If a pod shuts down right away the grace period won't help right? I believe this handles the time between pod stop and force kill.

@terrytangyuan
Copy link
Member Author

terrytangyuan commented Feb 5, 2021

A pod is added to the podCleanupQueue when it meets the podGCStrategy. This grace period is the time to wait before the pod in the queue gets deleted.

@alexec alexec self-assigned this Feb 8, 2021
config/config.go Outdated
@@ -92,6 +92,10 @@ type Config struct {
// PodSpecLogStrategy enables the logging of podspec on controller log.
PodSpecLogStrategy PodSpecLogStrategy `json:"podSpecLogStrategy,omitempty"`

// PodGCGracePeriodSeconds specifies the duration in seconds before the pods in the GC queue get deleted.
// Value must be non-negative integer. Defaults to zero, which indicates delete immediately.
PodGCGracePeriodSeconds int64 `json:"podGCGracePeriodSeconds,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint64 allowed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this to be *int64 to be consistent with the type of DeleteOptions.GracePeriodSeconds.

err := pods.Delete(ctx, podName, metav1.DeleteOptions{PropagationPolicy: &propagation})
err := pods.Delete(ctx, podName, metav1.DeleteOptions{
PropagationPolicy: &propagation,
GracePeriodSeconds: pointer.Int64Ptr(wfc.Config.PodGCGracePeriodSeconds)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is 30s by default, so presumably, you'll make this longer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's necessary to make this longer for certain scenarios.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@terrytangyuan @alexec I am still trying to understand how this change waits before deleing pods, this call here deletes the pod setting the grace-period-seconds, which as far as I understand what will happen:

  1. SIGTERM is sent to container
  2. SIGKILL is sent if container does not gracefully shutdown within the grace-period

If my container exits immediately after the SIGTERM or in this case is not even running as it is completed how is the grace period helping to delay it's deletion?

Copy link
Member

@agilgur5 agilgur5 Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For posterity, the above was resolved and implemented in #6168

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
@alexec alexec merged commit daf1a71 into argoproj:master Feb 9, 2021
@alexec alexec added this to the v3.0 milestone Feb 9, 2021
@terrytangyuan terrytangyuan deleted the gc-strategy-graceperiod branch February 9, 2021 21:31
@simster7 simster7 mentioned this pull request Feb 16, 2021
33 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants