Rollout is stuck when leader election is enabled #1619

@liu-cong

What happened:

When leader election is enabled with 3 EPP replicas, updating the EPP Deployment gets stuck. With leader election enabled, at most one replica (the current leader) is ready at any given time. Killing the old leader so that a new leader can take over the lease would violate the default maxUnavailable=25% of a Kubernetes Deployment rolling update.
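The arithmetic behind this can be sketched as follows. This is a simplified model, not the Deployment controller's actual code: it assumes the documented rounding rules (maxUnavailable rounds down, maxSurge rounds up) and treats "ready" as "available". The function names are hypothetical.

```python
def rolling_update_budget(replicas, max_unavailable_pct=25, max_surge_pct=25):
    """Resolve percentage-based rolling-update limits to pod counts.

    Assumes the documented behaviour: maxUnavailable is rounded down,
    maxSurge is rounded up.
    """
    max_unavailable = (replicas * max_unavailable_pct) // 100      # floor
    max_surge = -((-replicas * max_surge_pct) // 100)              # ceiling
    return max_unavailable, max_surge


def can_remove_ready_pod(replicas, ready_now, max_unavailable):
    """Simplified check: removing one ready pod must leave at least
    (replicas - max_unavailable) pods available."""
    min_available = replicas - max_unavailable
    return ready_now - 1 >= min_available


max_unavailable, max_surge = rolling_update_budget(3)
print(max_unavailable, max_surge)                         # 0 1
# With leader election only the leader is ready, so ready_now == 1.
print(can_remove_ready_pod(3, 1, max_unavailable))        # False
```

Under these assumptions, 25% of 3 replicas resolves to a maxUnavailable of 0, so the rollout has no budget to take down the only ready pod, which is the old leader.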

What you expected to happen:

We would need the Recreate update strategy (instead of the default RollingUpdate) so that the old leader can be killed and a new leader can take over.
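As a rough illustration of that change, the sketch below builds a merge patch that switches a Deployment to the Recreate strategy. It is an assumption about how one might apply the workaround, not a fix confirmed in this issue; the function name is hypothetical. The `rollingUpdate` field is set to null because Kubernetes rejects a Deployment that specifies `rollingUpdate` together with `type: Recreate`.

```python
import json


def recreate_strategy_patch():
    """Build a merge patch that sets spec.strategy to Recreate.

    rollingUpdate is explicitly nulled so that any existing
    rolling-update settings are removed along with the type change.
    """
    return {"spec": {"strategy": {"type": "Recreate", "rollingUpdate": None}}}


print(json.dumps(recreate_strategy_patch()))
# {"spec": {"strategy": {"type": "Recreate", "rollingUpdate": null}}}
```

The printed JSON could be passed to `kubectl patch deployment <name> --type merge -p '<json>'`, where `<name>` is a placeholder for the EPP Deployment. Note that Recreate terminates all existing pods before creating new ones, so there is a brief window with no leader during the update.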


Labels

  • kind/bug: Categorizes issue or PR as related to a bug.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
