Fast scaling recreates the cluster through Robin #11
Merged
albertompe merged 5 commits into develop on Sep 8, 2025
miguelmdh approved these changes on Sep 1, 2025
albertompe added a commit that referenced this pull request on Feb 11, 2026
* Initial release of the Redis Cluster Operator with CRD update from the original repository
* Formatting code, updating Makefile goals to check the code and update deprecated function calls
* Add GitHub workflow for static code analysis and unit testing
* Fix webhook build
* Initial version of the release GH workflow
* Add header Copyright and License info to all files containing code
* Do not add unnecessary log messages
* update docs with status codes and status transitions
* update docs with status codes and status transitions
* fix: ubuntu runners version (#7)
* Parallelize Ginkgo tests to reduce runtime by ~50% (#3)
* Architecture redefinition and performance improvements (#5)
* fix manifest goals to generate CRD manifest into config/crd/bases
* camelcasing purgeKeysOnRebalance property in CRD v1
* add SubStatus to RedisCluster object, code refactoring and cleanup, makefile review
* Makefile goal messages alignment
* rename Monitoring component as Robin and update docs
* rename Monitoring component as Robin and update docs
* set 'redis-cluster-name' and 'redis.rediscluster.operator/component' labels to Robin deployment
* Makefile cleanup and completion
* perform a fast upgrade when purgeKeysOnBalance flag is active and no replicas configured
* set to false only the conditions managed by the Operator when scaling down to 0 replicas
* fmt
* refactoring to remove overwritten parameter
* moving to inditex.dev group from inditex.com
* refactoring and add resources to Robin in sample manifest
* refactoring and add .dockerignore file
* refactoring CheckAndCreateK8sObjects function
* remove Spec.Labels populating with RedisCluster labels
* add support to replica updates when the cluster is being created and has not reached the Configuring status yet
* Fast Upgrade refactored to support multiple upgrades
* refactoring
* remove active waits for pods to be ready when upgrading a redis cluster
* remove not valid config parameter from ConfigMap creation
* update Go version to 1.24.4
* Robin integration
* remove check cluster nodes integrity when in Ready status
* remove cluster meet or cluster balance from Ready status reconciliation and refactoring
* rename manager source files
* fmt
* robin module added to make API calls, update redis cluster initialization, cleanup and refactoring
* check unsupported status
* compare Robin configurations excluding status when reconciling
* Remove double imports
* add checks to Initializing status
* add checks to Initializing status
* add Configuring status reconciliation logic
* add Configuring status reconciliation logic
* clean Ready status reconciliation checks removing Redis operations
* typo
* update cluster nodes info in RedisCluster object
* remove RedisCLI from CRD, code refactoring and cleanup
* refactor Robin code!
* refactoring
* refactor finalizers checks
* move status and substatus to constants
* recode fast upgrade to use Robin endpoints
* refactoring and cleanup
* fix checkAndCreateStatefulSet function to not issue a false error when all pods are created
* update cluster scalingup to use Robin
* Scaling down integrated with Robin
* Fast scaling and substatus renaming
* Cluster fix Robin call implementation
* Slow upgrade updated to use Robin
* Remove outdated code and all references to redis go lib
* fmt
* Remove dead code and cleanup
* Update to Go version 1.24.5
* Update cluster configuration when slow upgrading
* Renaming RedisCluster as RedKeyCluster (#10)
* Renaming CRD as RedKeyCluster, using rkcl shortname, and updating Makefile, config files, dockerfile and code
* fmt
* Rename manager go module
* fmt
* Rename code symbols, labels and log messages
* Rename manager user agent name
* Rename webhook
* Rename docs
* Rename internal code symbols
* Rename internal code symbols
* Update Go version to v1.24.6
* Add printcolumns to show master nodes, replicas per node, ephemeral flag and purgeKeysOnRebalance flag
* Add additional columns to kubectl get rkcl -o wide
* Rename Robin endpoints
* Delete redis references from Kubebuilder configuration
* Remove Redis references from tests and docs
* Fix Makefile!
* Add info message to Makefile
* Typo
* Add RedKey Cluster description to readme
* Rename RedisRobin as RedkeyRobin
* Fast scaling recreates the cluster through Robin (#11)
* Rename files
* Rename robinRedis variables
* Move Robin status to its own constants
* Fast scaling resetting the cluster nodes list
* No cluster fix needed
* Update SECURITY.md (#12)
* Check for open slots over reconciliations when upgrading and fixing upgrading logic (#13)
* Typo
* Reordering operations for cluster upgrade, including disabling Robin reconciliation
* Add copyright
* Fix image renaming when processing manifests files for debug profile
* Renaming PersistRobinStatut function as PersistRobinStatus
* Typo
* Added SetAndPersistRobinStatus function
* Extend upgrading requeuing time
* Deploy sample RedKeyCluster for Robin debugging using make (#14)
* Fixes and stabilization (#15)
* Do not force fix operation on Robin and let it do the scalingUp
* Build images using the Golang and Delve versions defined in Makefile
* Add debug launcher to connect to delve
* Add Robin /v1/cluster/status endpoint
* Check Robin cluster status to know when the cluster scaling is finished
* Check for cluster readiness using cluster status when configuring
* Update the rapid upgrading to check the cluster status to determine when the operation is finished
* Strengthening cluster upgrading to replicas updates (#16)
* Check pods readiness when upgrading taking the StatefulSet replicas as the number of required pods, do not update RKCL replicas when scaling up the cluster before upgrading
* fmt
* Remove unused function
* CRD validation added to block replicas from being modified if the cluster is not in Ready status
* Update docs (#17)
* Update docs
* Update docs
* Update Redis deployment section in TOC
* Fix duplicate entry for Redis Cluster deployment
* Fix capitalization of 'RedKey Robin' in TOC
* Refine documentation for Ephemeral Mode
  Improved clarity and grammar in the documentation regarding ephemeral mode, including changes to the explanation and YAML snippet.
* CRD and docs update, removing the webhook (#18)
* Remove webhook
* Update Redkey logo using k downcase
* Replace RedKey with Redkey (k downcase) in docs
* Replace RedKey with Redkey (k downcase)
* Rename RedKey with Redkey (k downcase)
* Use Primary nodes instead of Master nodes
* Update sed commands in Makefile
* Robin status is updated when entering/leaving Maintenance mode
* Move developer docs
* Fix Robin developer guide link
* Fix Service and StatefulSet overriding. purgeKeysOnRebalance and deletePvc are always shown when executing kubectl get rkcl
* Renaming master with primary
* Renaming master with primary, slave with replica, RedisCluster with RedkeyCluster and rdcl with rkcl
* Replace rdcl with rkcl
* fmt
* Update CRD to specify Robin configuration as part of the structure instead of as plain text yaml
* Doc update
* Remove redkeycluster_conversion.go file
* Update CRD
* Robin using primaries and decoupling Operator from Redis (#19)
* Node field masterId renamed as primaryId
* Update robin debug manifest
* Robin uses primaries instead of masters
* Update tests
* Update rediscluster references
* Remove slots from Robin response when asking for cluster nodes
* fmt
* Remove flags from Robin cluster nodes endpoint and add role
* Changes in development guide and tests fixes (#20)
* Add Manager Profiling
* fix: Removing PDB if Zero and scaling up Robin to 1 when needed (#22)
* feat: refactor test (#23)
  - Remove Controller because it is not needed in an E2E test.
  - Changing BeforeSuite and AfterSuite by their Synchronized pair, because they are sharing cluster.
* Redkeyoperator timeout each test
* refactor: e2e test, change names
* update tests
* Fix Redis Manager
* Lowering the ErrorRequeueTimeout from 30 seconds to 5 seconds. (#25)
* Prevent the use of purgeKeysOnRebalance on non-ephemeral clusters (#27)
* Adding validation to prevent the use of purgeKeysOnRebalance on non-ephemeral clusters
* Add copyright info
* Fix Robin integration in E2E tests (#28)
* feat: fix scale and PVC tests
* fix: scale up and down tests, typos and remove phases in tests
* feat: update robin configmap when scaling from 0
* fix: typo
* Typo
* Update logging message
* Increasing put timeout (reset node requirement)
* Reset node when upgrading moved to RollingConfig substatus and getting current partition simplified
* Renaming remaining redis references to redkey
  ---------
  Co-authored-by: Alberto Martínez Pérez <albertompe@ext.inditex.com>
* [OSOFFICE-88][OSS Release][Redkey Operator] Code checks (#26)
* feat: Standardize the templates and add two workflows and informational files. The information is not complete yet
* feat: add licenses
* license
* feat: ccbysa4.0 license
* feat: add licence and copyright
* feat: add licences
* feat: reuse.toml
* feat: notice license
* feat: repolinter
* feat: reuse
  ---------
  Co-authored-by: Alberto Martínez Pérez <albertompe@ext.inditex.com>
* Handle scaling up from 0 primaries (#31)
* Create Robin deployment with 0 replicas if the cluster has 0 primaries to avoid pod errors
* Immediate requeue when scaling up Robin deployment
* Remove unused variable
* Add E2E test to check scaling from 0 primaries
* Update tools versions (#32)
* Update Dockerfile to use debian trixie base image
* Update Go version to v1.25.6, Controller Tools to v0.20.0, Operator SDK to v1.42.0 and Kustomize to v5.8.0
* Move go config to .tool-versions file
* Add copyright info
* fix: .tool-version file. (#33)
* Use Robin internal cluster status (#34)
* Use Robin internal cluster status
* Complete logging info
* Use PartialPodTemplateSpec in RobinSpec instead of corev1.PodTemplateSpec (#36)
* Use PartialPodTemplateSpec in RobinSpec instead of corev1.PodTemplateSpec
* fmt
* Refactoring slow upgrade and renaming substates (#38)
* Fix robin debug manifests
* Fix logging message
* Reduce log verbosity when moving slots
* Refactoring to suit the Robin regression fix. Substatuses renaming
* Removing unnecessary checks when slow upgrading
* Renaming substatus
* Update docs to suit substatus renaming
* Prepare release (#39)
* Update README.md file adding quick start guide, project detailed info and reorganizing
* Update license header
* Configure coherent controller-gen version for the operator sdk version used
* Add PROJECT file required by operator sdk to build the bundle
* Add base CSV file
* Fix logo size
* Add Operator icon to CSV base file
* Update version and set the CRD annotation version from make
* Update the operator-version when generating the manifests to update bundle and deployment manifests generated
* Update logging message
* Trimming file
* Commit generated CRD
* Remove image set by kustomize to not interfere with other goals
* Update Makefile to build and push the catalog
* clean goal removes .local directory
* Updating development guide
* Rename variables
* Update Go version to 1.25.7
* Use profile as IMG version
* Use version to tag IMG for coherence with the other release goals
* release.yml workflow updated to build and push operator, bundle and catalog images
* Update maintainers email address in CSV base file
* Add copyright info
* Update version in Quick Start guide
* Reference repo releases to get the available versions
* Fixing image name in release workflow
* Update operator info in base csv file
* Adjust channels in Makefile
* Update CSV base adding operator info
* Update CSV base -> capability level set to Auto Pilot
* Fix copyright headers
* Check bundle generation using IMAGE_TAG_BASE
* Use redkey.inditex.dev group
* Add required copyright headers to generated files
* Fix test-e2e-cov goal config
* Add job to install CRD before launching e2e tests in the wf
* Use the public Robin image
* Update E2E tests workflow
* Update E2E tests workflow
* Update E2E tests workflow
* Update E2E tests workflow
* Parallel E2E Tests factor: 2
* Parallel E2E Tests factor 1 and increased timeout in primary/replica tests
* Reduce pods in primary/replica tests
* Do not execute primary/replica e2e tests till the repo is configured as public because of reduced runners resources
Fast scaling uses the new endpoint exposed by Robin to recreate the cluster.