Restart vmagent shards concurrently if replicaCount > 1 #1248

Open
@xiaozongyang

Background

We currently run 12 vmagent shards, with 2 replicas per shard.
When we apply changes to the VMAgent CRD, the operator restarts the VMAgent pods one by one. Each pod takes about 2 minutes to restart and become ready, so with 24 pods I spend about 50 minutes watching the rollout. I'm wondering whether we could change the restart process to speed this up.
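
To make the estimate concrete, here is a small sketch of the arithmetic. The function name is made up for illustration, and it assumes every pod takes the same time and that restarts never overlap.

```python
def sequential_rollout_minutes(
    shard_count: int, replica_count: int, minutes_per_pod: float
) -> float:
    """Total rollout time when pods restart strictly one after another."""
    return shard_count * replica_count * minutes_per_pod


# Numbers from this issue: 12 shards, 2 replicas, about 2 minutes per pod.
print(sequential_rollout_minutes(12, 2, 2))  # 12 * 2 * 2 = 48
```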

What I want

Since we have 2 replicas for each shard, we could restart one replica of every shard at a time. The full steps would be:

  1. restart replica 0 of every shard
  2. wait until all replica 0 pods are ready
  3. restart replica 1 of every shard
  4. wait until all replica 1 pods are ready

If we change the deploy process this way, upgrades should complete much more quickly.
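
The ordering I have in mind could be sketched like this. It is only an illustration of the steps, not operator code: `plan_waves`, `rolling_restart`, and the `restart_pod` callback are hypothetical names, and `restart_pod` is assumed to block until the pod is ready again.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Pod = Tuple[int, int]  # (shard index, replica index)


def plan_waves(shard_count: int, replica_count: int) -> List[List[Pod]]:
    """Group pods into waves: wave r holds replica r of every shard."""
    return [
        [(shard, replica) for shard in range(shard_count)]
        for replica in range(replica_count)
    ]


def rolling_restart(
    shard_count: int,
    replica_count: int,
    restart_pod: Callable[[Pod], None],
) -> None:
    """Restart one wave at a time; pods inside a wave restart concurrently.

    restart_pod is a hypothetical callback that is assumed to return
    only once the pod is ready again.
    """
    for wave in plan_waves(shard_count, replica_count):
        if not wave:
            continue  # nothing to restart when there are no shards
        with ThreadPoolExecutor(max_workers=len(wave)) as pool:
            # list() waits for every restart in this wave and re-raises
            # any error, so the next wave starts only after this one
            # is fully ready.
            list(pool.map(restart_pod, wave))
```

With our setup this produces 2 waves of 12 pods each, so the rollout is bounded by the number of replicas rather than the number of pods, assuming a pod does not restart more slowly when others restart at the same time. The other replica of each shard stays up during every wave.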

Furthermore, we could restart all shards concurrently if shardCount does not change.

cc @f41gh7

Labels: enhancement