Fix flaky cluster automatic failover test #3206

Merged
enjoy-binbin merged 1 commit into valkey-io:unstable from enjoy-binbin:fix_test
Feb 17, 2026
Conversation

@enjoy-binbin
Member

The test is slow for some reason, and with the short cluster-node-timeout the
replica's data can be considered too old under cluster-replica-validity-factor,
so an automatic failover may fail to trigger:

*** [err]: Automatic failover vote is not limited by two times the node timeout - mixed failover in tests/unit/cluster/manual-failover.tcl
The third failover does not happen

xxx # Cluster state changed: fail
xxx # Cluster is currently down: At least one hash slot is not served by any available node. Please check the 'cluster-require-full-coverage' configuration.

Closes #3203.
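
For readers unfamiliar with the validity check mentioned above, here is a minimal sketch of the rule as it is described in the configuration file comments: a replica does not start an automatic failover when the time since it last heard from its primary exceeds `(node-timeout * cluster-replica-validity-factor) + repl-ping-replica-period`. The function name and the numbers in the usage lines are hypothetical and are not taken from the test or from the server source, which may apply further adjustments.

```python
def replica_data_is_fresh(data_age_ms: int,
                          node_timeout_ms: int,
                          validity_factor: int,
                          repl_ping_period_s: int) -> bool:
    """Hypothetical helper: apply the documented validity-factor rule.

    data_age_ms is the time since the replica last interacted with its
    primary. A factor of 0 disables the check, so the replica always
    considers itself eligible to fail over.
    """
    if validity_factor == 0:
        return True
    # Documented limit: (node-timeout * factor) + repl-ping-replica-period.
    max_age_ms = node_timeout_ms * validity_factor + repl_ping_period_s * 1000
    return data_age_ms <= max_age_ms


# Illustrative values only (assumed, not the test's real settings):
# a 1000 ms node timeout, factor 1, and a 1 s ping period give a 2000 ms limit.
on_time = replica_data_is_fresh(1500, 1000, 1, 1)
too_slow = replica_data_is_fresh(3500, 1000, 1, 1)
```

With a short node timeout the limit is small, so a slow test run can push the data age past it. The replica then declines to fail over and the slots stay uncovered, which would match the "Cluster state changed: fail" lines in the log.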

@zuiderkwast
Contributor

Let's backport it. Not sure which versions are affected. All?

Contributor

@sarthakaggarwal97 sarthakaggarwal97 left a comment

Thanks @enjoy-binbin

@enjoy-binbin enjoy-binbin moved this to To be backported in Valkey 8.1 Feb 17, 2026
@enjoy-binbin enjoy-binbin moved this to To be backported in Valkey 9.0 Feb 17, 2026
@enjoy-binbin
Member Author

We added it in 8.1, I think.

@enjoy-binbin enjoy-binbin merged commit 556180d into valkey-io:unstable Feb 17, 2026
22 of 23 checks passed
@enjoy-binbin enjoy-binbin deleted the fix_test branch February 17, 2026 01:56
@roshkhatri roshkhatri moved this from To be backported to 8.1.6 WIP in Valkey 8.1 Feb 17, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 17, 2026
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
@roshkhatri roshkhatri moved this from To be backported to 9.0.3 in Valkey 9.0 Feb 17, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 17, 2026
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
Signed-off-by: Binbin <binloveplay1314@qq.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 20, 2026
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
madolson pushed a commit that referenced this pull request Feb 24, 2026
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
madolson pushed a commit that referenced this pull request Feb 24, 2026
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Labels

None yet

Projects

Status: 8.1.6 WIP
Status: 9.0.3 (WIP)

Development

Successfully merging this pull request may close these issues.

[TEST-FAILURE] Test failure in unit/cluster/manual-failover.tcl

5 participants