BUG: ClusterSummary status showing Provisioned when deployment failed #1596

@wahabmk

Problem Description

  1. Deployed the ingress-nginx Helm chart with 1 replica using a Profile (the Profile was created by deploying a ClusterDeployment in k0rdent):
apiVersion: config.projectsveltos.io/v1beta1
kind: Profile
metadata:
  creationTimestamp: "2026-01-26T17:03:18Z"
  finalizers:
  - profilefinalizer.projectsveltos.io
  generation: 3
  labels:
    k0rdent.mirantis.com/managed: "true"
    projectsveltos.io/cluster-name: cdunkelb-standalone-m78rdg
    projectsveltos.io/cluster-type: Capi
    projectsveltos.io/profile-name: cdunkelb-standalone-m78rdg
  name: cdunkelb-standalone-m78rdg
  namespace: kcm-system
  ownerReferences:
  - apiVersion: k0rdent.mirantis.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ServiceSet
    name: cdunkelb-standalone-m78rdg
    uid: eba38b73-d815-4850-bba3-8cc341b45f7a
spec:
  clusterRefs:
  - apiVersion: cluster.x-k8s.io/v1beta2
    kind: Cluster
    name: cdunkelb-standalone-m78rdg
    namespace: kcm-system
  clusterSelector: {}
  continueOnConflict: true
  continueOnError: false
  helmCharts:
  - chartName: ingress-nginx
    chartVersion: 4.13.2
    helmChartAction: Install
    options:
      installOptions:
        createNamespace: true
        replace: true
    releaseName: ingress-nginx
    releaseNamespace: ingress-nginx
    repositoryName: ingress-nginx
    repositoryURL: oci://ghcr.io/k0rdent/catalog/charts
    values: |-
      controller:
        replicaCount: 1
  2. Updated the values to 2 replicas:
  values: |-
    controller:
      replicaCount: 2
  3. The ClusterSummary status shows Provisioned:
- apiVersion: config.projectsveltos.io/v1beta1
  kind: ClusterSummary
  metadata:
    creationTimestamp: "2026-01-26T18:00:57Z"
    finalizers:
    - clustersummaryfinalizer.projectsveltos.io
    generation: 3
    labels:
      k0rdent.mirantis.com/managed: "true"
      projectsveltos.io/cluster-name: cdunkelb-standalone-m78rdg
      projectsveltos.io/cluster-type: Capi
      projectsveltos.io/profile-name: cdunkelb-standalone-m78rdg
    name: p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg
    namespace: kcm-system
    ownerReferences:
    - apiVersion: config.projectsveltos.io/v1beta1
      kind: Profile
      name: cdunkelb-standalone-m78rdg
      uid: 93a897f6-9c12-486c-9291-0cd1b6c802de
    resourceVersion: "96716"
    uid: ec030794-dccc-427e-8a24-73e739c4002e
  spec:
    clusterName: cdunkelb-standalone-m78rdg
    clusterNamespace: kcm-system
    clusterProfileSpec:
      clusterRefs:
      - apiVersion: cluster.x-k8s.io/v1beta2
        kind: Cluster
        name: cdunkelb-standalone-m78rdg
        namespace: kcm-system
      clusterSelector: {}
      continueOnConflict: true
      continueOnError: false
      helmCharts:
      - chartName: ingress-nginx
        chartVersion: 4.13.2
        helmChartAction: Install
        options:
          atomic: false
          dependencyUpdate: false
          disableHooks: false
          disableOpenAPIValidation: false
          enableClientCache: false
          installOptions:
            createNamespace: true
            disableHooks: false
            replace: true
          skipCRDs: false
          skipSchemaValidation: false
          uninstallOptions:
            disableHooks: false
          upgradeOptions:
            cleanupOnFail: false
            disableHooks: false
            force: false
            maxHistory: 2
            recreate: false
            resetThenReuseValues: false
            resetValues: false
            reuseValues: false
            subNotes: false
            upgradeCRDs: false
          wait: false
          waitForJobs: false
        releaseName: ingress-nginx
        releaseNamespace: ingress-nginx
        repositoryName: ingress-nginx
        repositoryURL: oci://ghcr.io/k0rdent/catalog/charts
        values: |-
          controller:
            replicaCount: 2
      . . .
  status:
    dependencies: no dependencies
    featureSummaries:
    - featureID: Resources
      hash: TykvJFM9uvkodDA2BA+aMrr+sIMaVafd+QN1nFqy2Cc=
      lastAppliedTime: "2026-01-26T19:21:24Z"
      status: Provisioned
    - featureID: Helm
      hash: 3j/Tcmi/pB9bpdaoN5kPP/mJMOQ5eseUMRY6PTn+ayA=
      lastAppliedTime: "2026-01-26T19:21:24Z"
      status: Provisioned
    helmReleaseSummaries:
    - releaseName: ingress-nginx
      releaseNamespace: ingress-nginx
      status: Managing
      valuesHash: cy9BanPVkgV917PjkmARP80b8K2mW72dNWrk5Sv93NA=
  4. But the actual Kubernetes Deployment remains at 1 replica.
  5. So in summary:
ClusterDeployment: Values propagated successfully, generation increased from 2 to 3
ServiceSet: Values propagated successfully, generation increased from 2 to 3
Profile (sveltos): Values present in spec
Helm release (child cluster): Values propagated successfully, revision 2 shows replicaCount: 2
Actual Deployment: Values NOT applied, still showing 1 replica
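
The summary above can be read as a chain of layers that should all agree on the replica count. As a rough sketch of that reasoning (not part of Sveltos or k0rdent; the function name `first_divergent_layer` and the hard-coded values are illustrative, taken from the observations listed above), the first layer that disagrees with the desired value can be located like this:

```python
from typing import Optional


def first_divergent_layer(desired: int, observed: list[tuple[str, int]]) -> Optional[str]:
    """Return the first layer whose replica count differs from the desired one."""
    for layer, replicas in observed:
        if replicas != desired:
            return layer
    return None


# Replica counts per layer as reported in this issue after the update.
# These are copied by hand from the summary, not queried from a cluster.
layers = [
    ("ClusterDeployment", 2),
    ("ServiceSet", 2),
    ("Profile", 2),
    ("Helm release", 2),
    ("Deployment", 1),
]

print(first_divergent_layer(2, layers))  # prints "Deployment"
```

With the values from this report, only the last hop (Helm release to Deployment) diverges.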

This behavior was intermittent; we weren't able to reproduce it reliably. When it did occur, we also saw the following error in the addon-controller logs:

I0126 19:21:13.747027       1 clustersummary_controller.go:146] "Reconciling" controller="clustersummary" controllerGroup="config.projectsveltos.io" controllerKind="ClusterSummary" ClusterSummary="kcm-system/p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg" namespace="kcm-system" name="p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg" reconcileID="75eaa2bf-08c7-4522-bc04-4f5032c592bf"
I0126 19:21:13.747133       1 clustersummary_controller.go:342] "Reconciling ClusterSummary" controller="clustersummary" controllerGroup="config.projectsveltos.io" controllerKind="ClusterSummary" ClusterSummary="kcm-system/p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg" namespace="kcm-system" name="p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg" reconcileID="75eaa2bf-08c7-4522-bc04-4f5032c592bf"
E0126 19:21:14.145424       1 clustersummary_controller.go:422] "failed to deploy" err=<
	deploying resources failed: request is queued
	deploying helm charts failed: request is queued
 > controller="clustersummary" controllerGroup="config.projectsveltos.io" controllerKind="ClusterSummary" ClusterSummary="kcm-system/p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg" namespace="kcm-system" name="p--cdunkelb-standalone-m78rdg-capi-cdunkelb-standalone-m78rdg" reconcileID="75eaa2bf-08c7-4522-bc04-4f5032c592bf"
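
For anyone trying to spot the same condition in their own addon-controller logs, here is a minimal sketch that pulls the individual failure lines out of the multi-line `err=< ... >` block shown above. The helper name `extract_deploy_errors` is hypothetical, and the sample string is a shortened copy of the log entry (the trailing key/value fields are truncated); it assumes the block always opens with `err=<` at the end of a line and closes with a `>` at the start of a later line.

```python
import re

# Matches the body between 'err=<' and the closing '>' on its own line start.
ERR_BLOCK = re.compile(r"err=<\n(?P<body>.*?)\n\s*>", re.DOTALL)


def extract_deploy_errors(log_text: str) -> list[str]:
    """Collect the non-empty lines found inside every err=< ... > block."""
    errors = []
    for match in ERR_BLOCK.finditer(log_text):
        for line in match.group("body").splitlines():
            line = line.strip()
            if line:
                errors.append(line)
    return errors


# Shortened copy of the log entry from this issue.
sample = (
    'E0126 19:21:14.145424       1 clustersummary_controller.go:422] "failed to deploy" err=<\n'
    "\tdeploying resources failed: request is queued\n"
    "\tdeploying helm charts failed: request is queued\n"
    ' > controller="clustersummary"\n'
)

for err in extract_deploy_errors(sample):
    print(err)
```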

So perhaps, when we observed this behavior, the Helm deployment failed for some reason and the Deployment stayed at 1 replica, but the failure wasn't reflected in the ClusterSummary's status. That would explain the confusion: we thought the Helm chart had been provisioned with the new values when it hadn't.
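
The mismatch we suspect can be expressed as a simple check: every feature in the ClusterSummary reports Provisioned while the workload still differs from the requested values. The sketch below is only an illustration of that condition; `provisioned_but_drifted` is a hypothetical helper, the dict mirrors just the `status.featureSummaries` fields shown earlier in this issue, and the replica counts are passed in by hand rather than read from a cluster.

```python
def provisioned_but_drifted(cluster_summary: dict, desired_replicas: int,
                            observed_replicas: int) -> bool:
    """True when all features report Provisioned but the replica count differs."""
    features = cluster_summary.get("status", {}).get("featureSummaries", [])
    all_provisioned = bool(features) and all(
        f.get("status") == "Provisioned" for f in features
    )
    return all_provisioned and observed_replicas != desired_replicas


# Trimmed copy of the ClusterSummary status from this issue.
cluster_summary = {
    "status": {
        "featureSummaries": [
            {"featureID": "Resources", "status": "Provisioned"},
            {"featureID": "Helm", "status": "Provisioned"},
        ]
    }
}

# Requested 2 replicas, Deployment still at 1.
print(provisioned_but_drifted(cluster_summary, desired_replicas=2, observed_replicas=1))  # prints True
```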

System Information

CLUSTERAPI VERSION: v1.11.2
SVELTOS VERSION: v1.1.1
KUBERNETES VERSION: v1.33.4+k0s
