
Conversation

@Joibel Joibel (Member) commented Jan 27, 2025

Supersedes #14088

Motivation

Split the pod (formerly pod GC) controller from the main workflow controller to make it easier to reason about.

Attempt to handle various conditions whereby pods can become orphaned when pod finalizers are in use.

Modifications

Split the pod controller's components and queue into the workflow/controller/pod subdirectory.

Callbacks into the workflow controller are provided so that pod events can enqueue workflows for reconciliation.
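
For illustration, a minimal sketch of what such a callback wiring might look like; the type name, struct fields, and handler function here are assumptions for this example, not the PR's actual API.

package pod

// WorkflowEnqueuer is a hypothetical callback handed to the pod controller by the
// workflow controller so that pod events can re-trigger workflow reconciliation.
type WorkflowEnqueuer func(namespace, workflowName string)

// Controller is stripped down for the sketch; informer, queue and logger are omitted.
type Controller struct {
	enqueueWorkflow WorkflowEnqueuer
}

// onPodEvent would be wired up as the pod informer's add/update handler.
func (c *Controller) onPodEvent(namespace, podName, owningWorkflow string) {
	if owningWorkflow == "" {
		return // not a workflow pod, nothing to enqueue
	}
	// Hand the event back to the workflow controller's queue.
	c.enqueueWorkflow(namespace, owningWorkflow)
}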

The pod controller can now also make decisions on its own. This is especially important for orphaned pods, where the workflow has been deleted without the pod finalizers being removed. The most obvious case is a PodGC strategy with a delete delay duration: there, the only record that a pod is enqueued for deletion is the pod GC delayed queue, which is in memory, so a workflow-controller restart loses track of it.

Added a number of testing hooks so that the workflow/controller tests don't need rewriting where they inspect the internals of the pod controller.

Verification

Existing E2E and unit testing, and a few extensions.

Documentation

This is mostly a refactor and amendment of code to accommodate more configurations.

Signed-off-by: Alan Clucas <alan@clucas.org>
@Joibel Joibel requested a review from isubasinghe January 27, 2025 16:40
go wfc.wfInformer.Run(ctx.Done())
go wfc.wftmplInformer.Informer().Run(ctx.Done())
go wfc.podInformer.Run(ctx.Done())
go wfc.PodController.Run(ctx, 0)

Member:

I presume this was done to ensure that the pods don't get processed by the workers, so that you may reliably inspect the queue?

Member Author:

By "this" I assume you mean the 0 for number of workers. Yes.

In the same way the old code created a queue but never actually ran any workers, this means the tests have to manually do processNextItem so they can evaluate the result and inspect.

I'll add a comment.
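
For illustration, the pattern being described might look roughly like this in a test; newTestController and processNextItem are assumed names based on the discussion above, not necessarily the exact helpers in this PR.

package pod

import (
	"context"
	"testing"
)

// Sketch only: the fixture and hook names are assumptions.
func TestQueueDrainedManually(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	c := newTestController(t)
	// Run with 0 workers: nothing drains the queue automatically, so the test
	// controls exactly when each queued pod action is processed.
	go c.Run(ctx, 0)

	// ... trigger something that enqueues a pod cleanup action ...

	// Drive the queue by hand; each call handles exactly one queued item.
	for c.processNextItem(ctx) {
	}

	// ... assert on the resulting pod / queue state ...
}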

@isubasinghe isubasinghe (Member) left a comment:

Looks good, but I have some questions/issues.

},
},
)
go podController.podInformer.Run(ctx.Done())

Member:

Should this not move to the Run method?

Member Author:

Good point, I'll move it.
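
As a rough sketch of the suggested change (field and method names are assumptions for this example, not this PR's code), Run would own the informer lifecycle: start it, wait for the cache to sync, then start workers.

package pod

import (
	"context"
	"time"

	"github.com/sirupsen/logrus"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

// Controller fields here are assumptions for the sketch.
type Controller struct {
	podInformer cache.SharedIndexInformer
	queue       workqueue.RateLimitingInterface
	log         *logrus.Entry
}

// Run starts the informer itself, waits for its cache to sync, then launches workers.
func (c *Controller) Run(ctx context.Context, workers int) {
	defer c.queue.ShutDown()

	go c.podInformer.Run(ctx.Done())

	// Don't let workers act on an unsynced (empty) cache.
	if !cache.WaitForCacheSync(ctx.Done(), c.podInformer.HasSynced) {
		c.log.Error("timed out waiting for pod informer cache to sync")
		return
	}

	for i := 0; i < workers; i++ {
		go wait.UntilWithContext(ctx, c.runWorker, time.Second)
	}
	<-ctx.Done()
}

func (c *Controller) runWorker(ctx context.Context) {
	// drain items from c.queue here (omitted)
}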

Comment on lines +194 to +236
func (c *Controller) commonPodEvent(pod *v1.Pod, deleting bool) {
	// None of the pods that reach here are marked completed
	action := noAction
	minimumDelay := time.Duration(0)
	podGC := podGCFromPod(pod)
	switch {
	case deleting:
		if hasOurFinalizer(pod.Finalizers) {
			c.log.WithFields(logrus.Fields{"pod.Finalizers": pod.Finalizers}).Info("Removing finalizers during a delete")
			action = removeFinalizer
			minimumDelay = 2 * time.Minute
		}
	case c.podOrphaned(pod):
		if hasOurFinalizer(pod.Finalizers) {
			action = removeFinalizer
		}
		switch {
		case podGC.Strategy == wfv1.PodGCOnWorkflowCompletion:
		case podGC.Strategy == wfv1.PodGCOnPodCompletion:
		case podGC.Strategy == wfv1.PodGCOnPodSuccess && pod.Status.Phase == apiv1.PodSucceeded:
			action = deletePod
		}
	}
	if action != noAction {
		// The workflow is gone and we have no idea when that happened, so base the timing on the pod's last transition
		lastTransition := podLastTransition(pod)
		// GetDeleteDelayDuration returns -1 if there is no duration; we don't care about a failure to parse otherwise here
		delay := time.Duration(0)
		delayDuration, _ := podGC.GetDeleteDelayDuration()
		// In the case of a raw delete, make sure we've had some time to process it if there was a finalizer
		if delayDuration < minimumDelay {
			delayDuration = minimumDelay
		}
		if !lastTransition.IsZero() && delayDuration > 0 {
			delay = time.Until(lastTransition.Add(delayDuration))
		}
		c.log.WithFields(logrus.Fields{"action": action, "namespace": pod.Namespace, "podName": pod.Name, "podGC": podGC, "delay": delay}).Info("queuing pod")
		switch {
		case delay > 0:
			c.queuePodForCleanupAfter(pod.Namespace, pod.Name, action, delay)
		default:
			c.queuePodForCleanup(pod.Namespace, pod.Name, action)
		}

Member:

I don't know how I feel about this code and queuePodsForCleanup being relatively similar, in the sense that they both drive the reconciliation of pods but live in different packages.

Member Author:

We discussed this on a call.

queuePodsForCleanup is the call into the pod controller, done this way to reduce the Kubernetes load and make life saner (a rough sketch of its shape follows the list below).
Thoughts:

  • We must wait for the node status update to be successfully written to the workflow. That rules out relying purely on pod observation to know when we can act on a pod, unless we add a lot of extra load to k8s by writing something to the pods so that the pod controller can see the change.
  • One option: the pod controller also observes the workflow informer and unwraps the workflows to read node statuses. This is costly in terms of CPU, and may also involve fetching the information from the status offload database. I avoided this double dip into workflow update events because at this point we already know we need to act, and it's unclear that it would be much less evil.
  • queuePodsForCleanup knows which pods need cleanup (by doing the naughty-list operation) because they are still in the informer (i.e. not marked complete or deleted), which minimizes the number that need checking. If we sent this event via a queue into the pod controller we would have to send the entire set of node statuses and teach the pod informer to understand the node-to-pod mapping.
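
For illustration only, a very rough sketch of the shape of that call on the workflow-controller side. Every name here (the wfOperationCtx fields, GetPodsForWorkflow, QueuePodForCleanup, and the action constants) is an assumption for this sketch, not necessarily this PR's API.

// Sketch: after node statuses have been persisted to the workflow, the workflow
// controller walks the pods it still sees in the informer and hands each one an
// action via the pod controller's queue.
func (woc *wfOperationCtx) queuePodsForCleanup() {
	strategy := woc.execWf.Spec.PodGC.GetStrategy()
	// Only pods still visible in the informer (not yet marked completed) are considered.
	for _, pod := range woc.controller.PodController.GetPodsForWorkflow(woc.wf.Namespace, woc.wf.Name) {
		action := removeFinalizer // safe now that the node status has been written
		if strategy == wfv1.PodGCOnPodSuccess && pod.Status.Phase == apiv1.PodSucceeded {
			action = deletePod
		}
		woc.controller.PodController.QueuePodForCleanup(pod.Namespace, pod.Name, action)
	}
}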

labelSelector := labels.NewSelector().
	Add(*workflowReq).
	// not sure if we should do this
	Add(*incompleteReq).

Member:

Personally I think we should remove this; the controller's job is to manage all pods created by Argo. People upgrading into this version might have trouble with pods not being handled if this stays the same.

Member Author:

There is no path whereby a pod still has the finalizer and is labelled as completed. Once labelled completed they drop out of view of the informer, yes, but monitoring pods that are marked as completed is a waste of resources; the owner reference to the workflow does all the necessary work.
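
For context, a self-contained sketch of how such a selector is built with the apimachinery labels package. The label keys shown are the well-known Argo pod labels, but treat the exact keys and requirement names as illustrative rather than this PR's code.

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/selection"
)

func main() {
	// Pods created by Argo carry the workflow label; pods the controller has finished
	// with are labelled completed=true and so fall outside this selector.
	workflowReq, err := labels.NewRequirement("workflows.argoproj.io/workflow", selection.Exists, nil)
	if err != nil {
		panic(err)
	}
	incompleteReq, err := labels.NewRequirement("workflows.argoproj.io/completed", selection.NotEquals, []string{"true"})
	if err != nil {
		panic(err)
	}

	labelSelector := labels.NewSelector().
		Add(*workflowReq).
		Add(*incompleteReq)

	// Prints: workflows.argoproj.io/completed!=true,workflows.argoproj.io/workflow
	fmt.Println(labelSelector.String())
}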

Signed-off-by: Alan Clucas <alan@clucas.org>
Signed-off-by: Alan Clucas <alan@clucas.org>
@Joibel Joibel marked this pull request as ready for review January 29, 2025 15:29

@isubasinghe isubasinghe (Member) left a comment:

Just a small comment; good to go after that's been addressed.

Comment on lines 141 to 144
// // log something after calling this function maybe?
// func podTerminating(pod *v1.Pod) bool {
// return pod.DeletionTimestamp != nil
// }

Member:

This needs to be removed.

Member Author:

Done

Signed-off-by: Alan Clucas <alan@clucas.org>
@Joibel Joibel enabled auto-merge (squash) February 13, 2025 10:56
@Joibel Joibel merged commit 6c5608e into argoproj:main Feb 13, 2025
31 checks passed
Joibel added a commit that referenced this pull request Mar 6, 2025
Signed-off-by: Alan Clucas <alan@clucas.org>
(cherry picked from commit 6c5608e)
kim-codefresh added a commit to codefresh-io/argo-workflows that referenced this pull request May 20, 2025
…abilities fixes (Cr 28355) (#358)

* fix: bump deps for k8schain to fix ecr-login (argoproj#14008) (release-3.6 cherry-pick) (argoproj#14174)

* fix(ci): python sdk release process (release-3.6) (argoproj#14183)

Signed-off-by: Alan Clucas <alan@clucas.org>

* docs: clarify qps/burst on controller (cherry-pick argoproj#14190) (argoproj#14192)

Signed-off-by: Tim Collins <tim@thecollins.team>
Co-authored-by: Tim Collins <45351296+tico24@users.noreply.github.com>

* fix(api/jsonschema): use unchanging JSON Schema version (cherry-pick argoproj#14092) (argoproj#14256)

Signed-off-by: Roger Peppe <rogpeppe@gmail.com>
Co-authored-by: Roger Peppe <rogpeppe@gmail.com>

* fix(api/jsonschema): use working `$id` (cherry-pick argoproj#14257) (argoproj#14258)

Signed-off-by: Roger Peppe <rogpeppe@gmail.com>
Co-authored-by: Roger Peppe <rogpeppe@gmail.com>

* docs: autogenerate tested k8s versions and centralize config (argoproj#14176) (release-3.6) (argoproj#14262)

Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* chore(deps): bump minio-go to newer version (argoproj#14185) (release-3.6) (argoproj#14261)

Co-authored-by: Vaibhav Kaushik <vaibhavkaushik@salesforce.com>

* fix: split pod controller from workflow controller (argoproj#14129) (release-3.6) (argoproj#14263)

* chore(deps): fix snyk (argoproj#14264) (release-3.6) (argoproj#14268)

* chore: revert to correct k8s versions

Accidental bump from argoproj#14176 cherry-pick

Signed-off-by: Alan Clucas <alan@clucas.org>

* chore(deps): bump github.com/go-jose/go-jose/v3 from 3.0.3 to 3.0.4 in the go_modules group (cherry-pick argoproj#14231) (argoproj#14269)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: wait for workflow informer to sync before pod informer (cherry-pick argoproj#14248) (argoproj#14266)

Signed-off-by: Rohan K <rohankmr414@gmail.com>
Co-authored-by: Rohan K <rohankmr414@gmail.com>

* fix(cli): remove red from log colour selection. Fixes argoproj#6740 (cherry-pick argoproj#14215) (argoproj#14278)

Signed-off-by: Prabakaran Kumaresshan <4676330+nixphix@users.noreply.github.com>
Co-authored-by: Prabakaran Kumaresshan <4676330+nixphix@users.noreply.github.com>

* fix: correct semaphore configmap keys for multiple semaphores (argoproj#14184) (release-3.6) (argoproj#14281)

* fix: don't print help for non-validation errors. Fixes argoproj#14234 (cherry-pick argoproj#14249) (argoproj#14283)

Signed-off-by: Koichi Shimada <jumpe1programming@gmail.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Co-authored-by: koichi <51446844+jumpe1@users.noreply.github.com>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* docs: fix kubernetes versions (release-3.6) (argoproj#14273)

Signed-off-by: Alan Clucas <alan@clucas.org>

* fix(workflow/sync): use RWMutex to prevent concurrent map access (cherry-pick argoproj#14321) (argoproj#14322)

Signed-off-by: Ryan Currah <ryan@currah.ca>
Co-authored-by: Ryan Currah <ryan@currah.ca>

* chore(lint): update golangci-lint to 2.1.1 (argoproj#14390) (cherry-pick release-3.6) (argoproj#14417)

* chore: bump golang 1.23->1.24 (argoproj#14385) (cherry-pick release-3.6) (argoproj#14418)

* fix: gracefully handle invalid CronWorkflows and simplify logic.  (cherry-pick argoproj#14197) (argoproj#14419)

Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* fix: prevent dfs sorter infinite recursion on cycle. Fixes argoproj#13395 (cherry-pick argoproj#14391) (argoproj#14420)

Signed-off-by: Adrien Delannoy <a.delannoyfr@gmail.com>
Co-authored-by: Adrien Delannoy <a.delannoyfr@gmail.com>

* chore(deps): bump github.com/expr-lang/expr from 1.16.9 to 1.17.0 (argoproj#14307) (release-3.6) (argoproj#14421)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps)!: update k8s and argo-events (release-3.6) (argoproj#14424)

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: william.vanhevelingen <william.vanhevelingen@acquia.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: William Van Hevelingen <william.vanhevelingen@acquia.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: William Van Hevelingen <William.VanHevelingen@acquia.com>
Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>

* fix: correct retry logic (argoproj#13734) (release-3.6) (argoproj#14428)

Signed-off-by: isubasinghe <isitha@pipekit.io>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Isitha Subasinghe <isitha@pipekit.io>

* fix: manual retries exit handler cleanup. Fixes argoproj#14180 (argoproj#14181) (release-3.6) (argoproj#14429)

Signed-off-by: isubasinghe <isitha@pipekit.io>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: Isitha Subasinghe <isitha@pipekit.io>

* fix: correct manual retry logic. Fixes argoproj#14124 (argoproj#14328) (release-3.6) (argoproj#14430)

Signed-off-by: oninowang <oninowang@tencent.com>
Signed-off-by: Alan Clucas <alan@clucas.org>
Co-authored-by: jswxstw <jswxstw@gmail.com>

* fix: disable ALPN in argo-server as a workaround (argoproj#14433)

Signed-off-by: Alan Clucas <alan@clucas.org>

* result of codegen

Signed-off-by: Kim <kim.aharfi@codefresh.io>

* fix:lint

Signed-off-by: Kim <kim.aharfi@codefresh.io>
