Upgrade fails when per-pod PVCs remain Pending (block-node StatefulSet) — upgrade/installation rollback #381

@mahmoudian1

Bill of Materials, Application, or CLI Version

sudo solo-provisioner block node install --profile=local --chart-version=0.26.2 --values=/path/to/values-override.yaml

Describe the bug

When running solo-provisioner to install/upgrade the block node (chart v0.26.2), the StatefulSet creates per-pod PVCs (e.g. live-storage-block-node-block-node-server-0) that remain Pending because no matching PVs exist. The Pending PVCs prevent pod scheduling, Helm times out waiting for the pod to become Ready, and because Helm runs with atomic=true the release is rolled back and the upgrade fails.

This issue groups the relevant troubleshooting logs and requests clearer documentation and/or a migration approach so upgrades/installations do not fail in this way.

Describe the expected behavior

The install/upgrade should create the resources the StatefulSet depends on, so that the per-pod PVCs bind and the pod can be scheduled. If matching per-pod PVs are missing, the installer/upgrade should detect this and emit a clear, actionable error that names the Pending per-pod PVCs and recommends remediation steps.
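
As an illustration of the kind of check requested above, here is a minimal sketch in POSIX shell. The helper names `pending_pvcs` and `check_pvcs_bound` are hypothetical and are not part of solo-provisioner; the sketch only assumes that `kubectl get pvc --no-headers` prints NAME and STATUS as the first two columns.

```shell
# Hypothetical preflight helpers; not part of solo-provisioner.

# Print the names of PVCs whose STATUS column is "Pending".
# Reads `kubectl get pvc --no-headers` output on stdin.
pending_pvcs() {
  awk '$2 == "Pending" { print $1 }'
}

# Return 1 with an actionable message on stderr if any PVC in the
# given namespace is Pending; return 2 if kubectl itself fails.
check_pvcs_bound() {
  _ns="$1"
  _out="$(kubectl -n "$_ns" get pvc --no-headers)" || return 2
  _pending="$(printf '%s\n' "$_out" | pending_pvcs)"
  if [ -n "$_pending" ]; then
    {
      echo "ERROR: PVCs still Pending in namespace $_ns:"
      printf '%s\n' "$_pending" | sed 's/^/  /'
      echo "Check for a matching PV or StorageClass: kubectl -n $_ns describe pvc <name>"
    } >&2
    return 1
  fi
  return 0
}
```

Calling `check_pvcs_bound block-node` after an install attempt would name the stuck claims directly instead of leaving the user to infer them from a Helm rollback.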

To Reproduce

Prepare a host machine and run:

sudo solo-provisioner block node install --profile=local --chart-version=0.26.2 --values=/path/to/values-override.yaml
Or run the upgrade instead:
sudo solo-provisioner block node upgrade --profile=local --chart-version=0.26.2 --with-reset --values=values-override.yaml
Watch the block-node namespace resources:
kubectl -n block-node get pods
kubectl -n block-node get pvc
kubectl -n block-node get pv

Observed behavior (logs / excerpts)
Pod pending due to unbound PVCs:

weaver@bn-lfh01-previewnet:~$ kubectl -n block-node get po
NAME                             READY   STATUS    RESTARTS   AGE
block-node-block-node-server-0   0/1     Pending   0          11s

Pod describe shows the pod Pending with unbound PVCs:

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  22s   default-scheduler  0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims.

The per-pod PVCs are Pending, while the standalone PVCs of the same storage types are already Bound to PVs:

kubectl -n block-node get pvc
NAME                                                  STATUS    VOLUME                    CAPACITY
archive-storage-block-node-block-node-server-0        Pending
archive-storage-pvc                                   Bound     archive-storage-pv        5Gi
live-storage-block-node-block-node-server-0           Pending
live-storage-pvc                                      Bound     live-storage-pv           5Gi
logging-storage-block-node-block-node-server-0        Pending
logging-storage-pvc                                   Bound     logging-storage-pv        5Gi
verification-storage-block-node-block-node-server-0   Pending
verification-storage-pvc                              Bound     verification-storage-pv   5Gi
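
The Pending names above follow the standard Kubernetes convention for claims generated from a StatefulSet's `volumeClaimTemplates`: `<template name>-<pod name>`. The sketch below makes that mapping explicit. The function names are hypothetical, and the template names (`live-storage`, etc.) are inferred from the listing rather than read from the chart.

```shell
# Name Kubernetes gives a claim created from a StatefulSet
# volumeClaimTemplate: <template name>-<pod name>.
statefulset_pvc_name() {
  printf '%s-%s\n' "$1" "$2"
}

# Recover the template name from a per-pod claim name, given the pod name.
pvc_template_name() {
  _suffix="-$2"
  printf '%s\n' "${1%"$_suffix"}"
}
```

For example, `statefulset_pvc_name live-storage block-node-block-node-server-0` yields the Pending claim `live-storage-block-node-block-node-server-0`, which is a different object from the Bound `live-storage-pvc`. This suggests the pod is waiting on the generated claims and not on the already-Bound ones.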

Solo-provisioner / Helm upgrade failed and was rolled back because atomic=true. In this excerpt the immediate error is the rejected StatefulSet patch:

2026-02-25T21:12:44Z DBG Patch StatefulSet "block-node-block-node-server" ...
2026-02-25T21:12:44Z DBG error updating the resource "...": cannot patch "block-node-block-node-server" with kind StatefulSet: StatefulSet.apps "block-node-block-node-server" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'ordinals', 'template', 'updateStrategy', 'revisionHistoryLimit', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
...
2026-02-25T21:12:45Z ERR Helm upgrade failed error="helm.upgrade_failed: failed to run upgrade action, cause: release block-node failed, and has been rolled back due to atomic being set: cannot patch "block-node-block-node-server" ...
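
The "Forbidden" error above is what the API server returns when an update touches an immutable StatefulSet field. `volumeClaimTemplates` is not in the list of mutable fields, so one plausible cause (an assumption, since the log does not name the changed field) is that the new chart version renders different claim templates than the deployed StatefulSet has. A minimal sketch for inspecting the live object, with a hypothetical helper name:

```shell
# Print "<name> <storageClassName> <requested size>" (tab-separated) for
# each volumeClaimTemplate of a live StatefulSet.
# Usage: sts_claim_templates <namespace> <statefulset-name>
sts_claim_templates() {
  kubectl -n "$1" get statefulset "$2" \
    -o jsonpath='{range .spec.volumeClaimTemplates[*]}{.metadata.name}{"\t"}{.spec.storageClassName}{"\t"}{.spec.resources.requests.storage}{"\n"}{end}'
}
```

Running `sts_claim_templates block-node block-node-block-node-server` and comparing the result with the claim templates rendered by the target chart version would show whether this field differs between the two.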

Example values override used:

cat values-override.yaml
blockNode:
  config:
    BLOCK_NODE_EARLIEST_MANAGED_BLOCK: "270691"
    BACKFILL_START_BLOCK: "270691"

Additional Context

No response

Labels

Bug: An error that causes the feature to behave differently than what was expected based on design docs
