Description
In some cases, a Redkey cluster upgrade gets stuck while resharding to empty a node before the rolling update.
This erroneous behavior usually occurs when upgrading a cluster with a high number of nodes.
When the resharding command is launched in Robin to empty the node, Robin gets stuck in the ReshardingError cluster status, so the Operator cannot continue with the upgrade.
Steps to Reproduce
Deploy the Operator and the sample Redkey cluster:
make manifests
make install
make deploy
make apply-rkcl
Edit the property purgeKeysOnRebalance to set the value to false.
Scale the cluster to 15 primaries.
Force an upgrade by editing the rkcl object and changing spec.config.maxmemory-samples to 6 (or any other value, or by adding a comment).
Force the upgrade again by making configuration changes until the problem is reproduced (it does not occur in all cases).
The Operator log should show Robin reporting the ReshardingError status:
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster RedkeyCluster reconciler called {"redis-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "name": "redis-cluster-ephemeral", "ns": "redkey-operator"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Found RedkeyCluster {"redis-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "name": "redis-cluster-ephemeral", "GVK": "redis.inditex.dev/v1, Kind=RedkeyCluster", "status": "Upgrading"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster RedkeyCluster reconciler start {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "status": "Upgrading"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster PodDisruptionBudget not deployed {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "PodDisruptionBudget Name": "redis-cluster-ephemeral-pdb"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Delete PVCs feature disabled in cluster spec or not specified. PVCs won't be deleted after scaling down or cluster deletion {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Redis node pods are ready {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "pods": 16}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster Waiting for cluster to be Ready in Robin {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "currentStatus": "ReshardingError"}
2026-01-25T08:41:43Z INFO controllers.RedkeyCluster RedkeyCluster reconciler end {"redkey-cluster": {"name":"redis-cluster-ephemeral","namespace":"redkey-operator"}, "status": "Upgrading"}
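Since the problem does not reproduce on every attempt, it may help to scan the Operator log automatically. Below is a minimal sketch, assuming each log line ends with a JSON object as in the excerpt above. The helper names `robin_status` and `stuck_in_resharding_error` and the `threshold` parameter are hypothetical and not part of the Operator or Robin.

```python
import json


def robin_status(line):
    """Return the "currentStatus" field of an Operator log line, or None.

    Assumes the structured fields are a JSON object starting at the
    first "{" in the line, as in the log excerpt above.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        fields = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    return fields.get("currentStatus")


def stuck_in_resharding_error(lines, threshold=3):
    """Return True if ReshardingError appears in `threshold` consecutive
    status reports. Lines without a currentStatus field are ignored.
    The threshold is an arbitrary illustrative value.
    """
    streak = 0
    for line in lines:
        status = robin_status(line)
        if status is None:
            continue
        streak = streak + 1 if status == "ReshardingError" else 0
        if streak >= threshold:
            return True
    return False
```

Feeding the saved Operator log to `stuck_in_resharding_error` after each forced upgrade would flag the attempts in which Robin never leaves ReshardingError.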
Expected Behavior
The upgrade must be completed leaving all nodes updated and without data loss.
Version / Environment
No response
Additional context or logs
No response