fix/clusterPodStatuses: only process `when` conditional if specified #1088

diamonwiggins · 2023-03-31T14:54:59Z

Description, Motivation and Context

Currently the `when` conditional for the clusterPodStatuses analyzer is expected for all outcome types. If it is not specified, the analyzer reports errors when run with the following spec:

apiVersion: troubleshoot.sh/v1beta2
kind: Preflight
metadata:
  name: kurl-builtin-oncluster
spec:
  collectors:
    - clusterResources: {}
    - clusterInfo: {}
  analyzers:
    - clusterPodStatuses:
        checkName: "Pod(s) healthy"
        outcomes:
          - warn:
              when: "!= Healthy" # Catch all unhealthy pods. A pod is considered healthy if it has a status of Completed, or Running and all of its containers are ready.
              message: "A Pod, {{ .Name }}, is unhealthy with a status of: {{ .Status.Reason }}. Restarting the pod may fix the issue."
          - pass:
              message: "All Pods are OK."

Because `when` isn't set for `pass`, the analyzer reports errors for that outcome. With the changes in this PR, `when` is optional, so the expected pattern of ending your outcome list with a default outcome is supported.
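A minimal sketch of the intended behavior, not the code in this PR: the `outcome` type and the `matches` and `firstMatch` helpers are hypothetical, pod health is reduced to a plain status string, and only `==` and `!=` are handled. It shows how treating an empty `when` as "always matches" lets the last outcome act as a default.

```go
package main

import (
	"fmt"
	"strings"
)

// outcome is a hypothetical, simplified stand-in for one analyzer outcome.
type outcome struct {
	Kind    string // "fail", "warn", or "pass"
	When    string // e.g. "!= Healthy"; empty means "always matches"
	Message string
}

// matches reports whether status satisfies the when expression.
// An empty expression matches every status, which is what makes a
// trailing default outcome possible.
func matches(when, status string) (bool, error) {
	if when == "" {
		return true, nil
	}
	parts := strings.Fields(when)
	if len(parts) < 2 {
		return false, fmt.Errorf("invalid 'when' format: %q", when)
	}
	switch parts[0] {
	case "==":
		return status == parts[1], nil
	case "!=":
		return status != parts[1], nil
	}
	return false, fmt.Errorf("unsupported operator %q", parts[0])
}

// firstMatch returns the first outcome whose condition holds for status.
func firstMatch(outcomes []outcome, status string) (outcome, bool) {
	for _, o := range outcomes {
		ok, err := matches(o.When, status)
		if err != nil {
			continue // skip a malformed condition instead of aborting
		}
		if ok {
			return o, true
		}
	}
	return outcome{}, false
}

func main() {
	outcomes := []outcome{
		{Kind: "warn", When: "!= Healthy", Message: "pod is unhealthy"},
		{Kind: "pass", Message: "All Pods are OK."},
	}
	for _, status := range []string{"Healthy", "CrashLoopBackOff"} {
		if o, ok := firstMatch(outcomes, status); ok {
			fmt.Printf("%s -> %s: %s\n", status, o.Kind, o.Message)
		}
	}
}
```

Because outcomes are evaluated in order, the outcome without `when` only fires when nothing before it matched, so it belongs at the end of the list.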

Fixes: #1087

Checklist

New and existing tests pass locally with introduced changes.
Tests for the changes have been added (for bug fixes / features)
The commit message(s) are informative and highlight any breaking changes
Any documentation required has been added/updated. For changes to https://troubleshoot.sh/ create a PR here

Does this PR introduce a breaking change?

Yes
No

camilamacedo86

@diamonwiggins that is great. Could we try to return only one result when pods are Healthy instead of a list with all Healthy pods on the cluster? Could we just return the list of those that are unhealthy?

diamonwiggins · 2023-03-31T17:32:49Z

> @diamonwiggins that is great. Could we try to return only one result when pods are Healthy instead of a list with all Healthy pods on the cluster? Could we just return the list of those that are unhealthy?

I think that's a good idea. Do you think we need to cover that here on this PR?

camilamacedo86

Returning the same success result many times seems to be a bug that is out of scope for this PR. See: #1089

I think it would be nice for this to have tests, but I am already very happy with the fix.
Thank you 👍

camilamacedo86

Terrific work 🥇 I added only one nit.
If possible, it would be great to have the mocks under a directory called testdata, following the Go best practice for keeping them. Otherwise, it has my LGTM.

camilamacedo86 · 2023-04-01T11:35:40Z

pkg/analyze/files/pods/default-unhealthy.json

@@ -0,0 +1,263 @@
+{


Could we add it inside a directory called testdata?
Go gives that directory name a particular meaning, and it is the best place to keep mock data.

+1

Specifically https://github.com/replicatedhq/troubleshoot/tree/main/testdata

[evans] $ go help test | grep -A1 testdata
The go tool will ignore a directory named "testdata", making it available
to hold ancillary data needed by the tests.

But that can be done in a separate PR. Many other test data files are already used from the current directory.

banjoh · 2023-04-03T11:13:35Z

> @diamonwiggins that is great. Could we try to return only one result when pods are Healthy instead of a list with all Healthy pods on the cluster? Could we just return the list of those that are unhealthy?

Looking at the other status analysers (Deployment, StatefulSet...), it looks like the intention is to show the analysis of each resource. However, I see your concern here @camilamacedo86. There is a bug (#1093) that needs to be fixed before this is possible, but we can improve how these analysis results are reported back by adding a title to each outcome, like so:

  analyzers:
    - clusterPodStatuses:
        checkName: "Pod(s) health status(es)"
        outcomes:
          - fail:
              title: "Pod {{ .Name }} is unable to pull images"
              when: "== ImagePullBackOff"
              message: "A Pod, {{ .Name }}, is unable to pull its image. Status is: {{ .Status.Reason }}"
          - warn:
              title: "Pod {{ .Name }} is unhealthy"
              when: "!= Healthy"
              message: "A Pod, {{ .Name }}, is unhealthy with a status of: {{ .Status.Reason }}. Restarting the pod may fix the issue."
          - pass:
              title: "Pod {{ .Name }} is healthy"
              message: "Pod {{ .Name }} is healthy"

This would render results like below

banjoh · 2023-04-03T11:19:46Z

> Could we just return the list of those that are unhealthy?

@camilamacedo86 I think if we ignore the pass condition, all the healthy ones are not shown in the analysis outcomes

banjoh · 2023-04-03T11:30:58Z

pkg/analyze/cluster_pod_statuses_test.go

@@ -0,0 +1,266 @@
+package analyzer
+
+import (


Good stuff with the tests!

Nitpick: could you add or modify tests to use the !==, ==, === and = conditionals for completeness? For example, === CrashLoopBackOff
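The operator coverage requested here can be sketched as a table of cases. This is an illustration under stated assumptions, not the test file from this PR: `compare` is a hypothetical helper, and grouping `=`, `==`, `===` as equality and `!=`, `!==` as inequality is inferred from the operators named in this comment.

```go
package main

import "fmt"

// compare applies a when-style operator to an observed and an expected status.
func compare(op, observed, expected string) (bool, error) {
	switch op {
	case "=", "==", "===":
		return observed == expected, nil
	case "!=", "!==":
		return observed != expected, nil
	}
	return false, fmt.Errorf("unsupported operator %q", op)
}

func main() {
	cases := []struct {
		op, observed, expected string
		want                   bool
	}{
		{"=", "CrashLoopBackOff", "CrashLoopBackOff", true},
		{"==", "CrashLoopBackOff", "CrashLoopBackOff", true},
		{"===", "CrashLoopBackOff", "CrashLoopBackOff", true},
		{"===", "Running", "CrashLoopBackOff", false},
		{"!=", "Running", "CrashLoopBackOff", true},
		{"!==", "CrashLoopBackOff", "CrashLoopBackOff", false},
	}
	for _, c := range cases {
		got, err := compare(c.op, c.observed, c.expected)
		if err != nil || got != c.want {
			fmt.Printf("FAIL %q %s %q: got %v, err %v\n", c.observed, c.op, c.expected, got, err)
			continue
		}
		fmt.Printf("ok   %q %s %q\n", c.observed, c.op, c.expected)
	}
}
```

In a real test file the same table would typically drive `t.Run` subtests, one per operator.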

banjoh · 2023-04-03T11:37:03Z

pkg/analyze/cluster_pod_statuses.go

+			if when != "" {
+				parts := strings.Split(strings.TrimSpace(when), " ")
+				if len(parts) < 2 {
+					println(fmt.Sprintf("invalid 'when' format: %s\n", when)) // don't stop


Whilst here, shall we use klog.Errorf instead? There is another println somewhere in the file

only process when conditional if specified

070cbea

diamonwiggins added the type::bug and bug::normal labels Mar 31, 2023

diamonwiggins requested a review from a team as a code owner March 31, 2023 14:55

camilamacedo86 reviewed Mar 31, 2023

camilamacedo86 previously approved these changes Mar 31, 2023

adding tests for cluster pod status analyzer

c32858f

diamonwiggins dismissed camilamacedo86’s stale review via c32858f March 31, 2023 20:15

diamonwiggins requested review from camilamacedo86 and banjoh March 31, 2023 20:17

camilamacedo86 approved these changes Apr 1, 2023

Merge branch 'main' into diamonwiggins/fix-cluster-pod-analyzer

e2c011b

camilamacedo86 previously approved these changes Apr 1, 2023

camilamacedo86 reviewed Apr 1, 2023

banjoh previously approved these changes Apr 3, 2023

diamonwiggins added 3 commits April 5, 2023 09:21

use klog instead of fmt for logging

5e8ce26

add additional tests for warn and more operators

ab8d682

Merge branch 'diamonwiggins/fix-cluster-pod-analyzer' of github.com:r…

bb1257a

…eplicatedhq/troubleshoot into diamonwiggins/fix-cluster-pod-analyzer

diamonwiggins dismissed stale reviews from banjoh and camilamacedo86 via bb1257a April 5, 2023 13:39

diamonwiggins requested a review from banjoh April 5, 2023 13:40

banjoh approved these changes Apr 6, 2023

banjoh merged commit 9a457f7 into main Apr 6, 2023

banjoh deleted the diamonwiggins/fix-cluster-pod-analyzer branch April 6, 2023 08:17
