Description
When upgrading a Redkey cluster, the operation gets stuck while resharding a node before the rolling update.
If a slot stays in importing or migrating state, the stabilizing mechanism is not launched.
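To confirm that a slot is really left open, the output of `CLUSTER NODES` can be inspected on each node. The sketch below is only an illustration and assumes the nodes expose the standard Redis `CLUSTER NODES` text format, where the node flagged `myself` lists a migrating slot as `[slot->-node-id]` and an importing slot as `[slot-<-node-id]`. The function name `find_open_slots` is hypothetical and is not part of the operator or Robin.

```python
import re

# Open-slot markers appended to the slot list of the "myself" node:
#   [<slot>->-<node-id>]  slot is migrating to <node-id>
#   [<slot>-<-<node-id>]  slot is importing from <node-id>
_OPEN_SLOT = re.compile(r"^\[(\d+)-([<>])-([0-9a-f]+)\]$")


def find_open_slots(cluster_nodes_output):
    """Return (node_id, slot, state, peer_id) for every open slot.

    Hypothetical helper: expects the raw text returned by CLUSTER NODES
    as seen from a single node.
    """
    open_slots = []
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        # Slot entries start at the ninth field; open-slot markers are
        # only reported on the line describing the queried node itself.
        if len(fields) < 8 or "myself" not in fields[2].split(","):
            continue
        for token in fields[8:]:
            match = _OPEN_SLOT.match(token)
            if match:
                slot, direction, peer = match.groups()
                state = "migrating" if direction == ">" else "importing"
                open_slots.append((fields[0], int(slot), state, peer))
    return open_slots
```

Because the markers only appear from the point of view of the queried node, the check would need to be run against every primary to get a full picture.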
Steps to Reproduce
Deploy the Operator and the sample Redkey cluster:
make manifests
make install
make deploy
make apply-rkcl
Edit the property purgeKeysOnRebalance to set the value to false.
Scale the cluster to 15 primaries.
Force an upgrade by editing the rkcl object and changing spec.config.maxmemory-samples to 6 (or any other value, or by adding a comment).
Force the upgrade again by making configuration changes until the problem is reproduced (it does not occur in all cases).
The log should show Robin status error:
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster RedkeyCluster reconciler called {"redis-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "name": "redis-cluster-ephemeral", "ns": "redkey-operator"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Found RedkeyCluster {"redis-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "name": "redis-cluster-ephemeral", "GVK": "redis.inditex.dev/v1, Kind=RedkeyCluster", "status": "Upgrading"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster RedkeyCluster reconciler start {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "status": "Upgrading"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster PodDisruptionBudget not deployed {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "PodDisruptionBudget Name": "redis-cluster-ephemeral-pdb"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Delete PVCs feature disabled in cluster spec or not specified. PVCs won't be deleted after scaling down or cluster deletion {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Redis node pods are ready {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "pods": 16}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Waiting for cluster to be Ready in Robin {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "currentStatus": "ReshardingError"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster RedkeyCluster reconciler end {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "status": "Upgrading"}
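Since the problem does not reproduce on every upgrade, it can help to scan the operator log automatically for the failing status. The following is a minimal sketch, assuming each log line ends with a single JSON object as in the excerpt above; the helper names `robin_status` and `stuck_in_resharding` are hypothetical.

```python
import json


def robin_status(log_line):
    """Return the "currentStatus" value carried by a log line, or None."""
    start = log_line.find("{")
    if start == -1:
        return None
    try:
        # Assumption: everything from the first "{" onward is one JSON object.
        payload = json.loads(log_line[start:])
    except json.JSONDecodeError:
        return None
    return payload.get("currentStatus")


def stuck_in_resharding(log_lines):
    """True if any line reports Robin in the ReshardingError status."""
    return any(robin_status(line) == "ReshardingError" for line in log_lines)
```

A reproduction loop could call `stuck_in_resharding` on the collected log after each forced upgrade and stop as soon as it returns `True`.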
Expected Behavior
The upgrade must be completed leaving all nodes updated and without data loss.
If a slot stays open, the stabilizing mechanism must be launched to resolve the problem.
Version / Environment
No response
Additional context or logs
No response