[BUG] Windows minion communication unreliable with multi-master failover #60010
Labels: Bug, P3, severity-medium, Windows
Tested Systems Info
Salt master v3003 running on Ubuntu 20.04.2 LTS (kernel 5.4.0-70-generic)
Salt minion v3003 running on Win10 Pro 20H2 (build 19042.870)
Description
The minion repeatedly attempts to switch master IP addresses even though the first master in the list has no communication problem. I think that during these switch attempts the minion does not respond properly to requests from the master, so communication with the minion fails intermittently and periodically.
When master_failback is false, the minion still switches back to the first master in the list.
Setup
A single master with two IP addresses, and a firewall that controls which of the two addresses the minion can communicate with (only one IP is permitted at a time).
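The report does not include the minion configuration. A minimal sketch of a failover configuration consistent with the description might look like the following. The two addresses are hypothetical placeholders for master IP 1 and IP 2, and `master_alive_interval: 30` is an assumption inferred from the 30-second failure period described in this report.

```yaml
# Hypothetical minion config; the addresses below are placeholders.
master:
  - 192.0.2.10   # master IP address 1 (first in the list)
  - 192.0.2.11   # master IP address 2
master_type: failover
# Assumed from the 30-second period reported in this issue.
master_alive_interval: 30
# Reported as having no effect: the minion still returns to IP 1.
master_failback: False
```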
Steps to Reproduce the behavior
1. Run the master.
2. Set the firewall to permit the minion to connect to master IP address 1 and block IP address 2.
3. Run the minion.
4. Start a ping test on the master:
   while sleep 2; do salt 'minion-name' test.ping; done
5. test.ping fails intermittently while the minion is replying to master IP 1. Failures occur every 30 seconds, when the alive interval triggers and the minion attempts to switch to master IP 2 (which is blocked).
6. Set the firewall to permit the minion to connect to master IP address 2 and block IP address 1.
7. test.ping fails intermittently while the minion is replying to master IP 2. Failures occur every 30 seconds, when the alive interval triggers and the minion attempts to switch to master IP 1 (which is blocked).
8. Set the firewall to permit the minion to connect to master IP address 1 and block IP address 2.
9. Even though master_failback is false, the minion switches back to the first master in the list (IP 1). test.ping continues to fail intermittently while the minion is replying to master IP 1.
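To confirm the 30-second period rather than eyeballing the output, the ping loop can record a timestamp with each result. The sketch below is one possible way to do that and is not part of the original report. The function names `log_pings` and `failure_gaps` and the `OK`/`FAIL` log format are made up for illustration, and it assumes a successful test.ping prints `True` for the minion.

```shell
#!/bin/sh
# Hypothetical helpers for timing test.ping failures; names are illustrative.

# Ping the given minion every 2 seconds and print "<epoch-seconds> OK|FAIL".
# Assumes a successful test.ping prints "True" in its output.
log_pings() {
    while sleep 2; do
        if salt "$1" test.ping 2>&1 | grep -q True; then
            status=OK
        else
            status=FAIL
        fi
        echo "$(date +%s) $status"
    done
}

# Read such a log on stdin and print the number of seconds between
# each pair of consecutive FAIL entries, one gap per line.
failure_gaps() {
    awk '$2 == "FAIL" { if (seen) print $1 - prev; prev = $1; seen = 1 }'
}
```

For example, `log_pings 'minion-name' | tee ping.log` collects the data on the master, and `failure_gaps < ping.log` afterwards lists the gaps. Gaps clustering around 30 would match the alive-interval behavior described in the steps above.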
Expected behavior
In master_type: failover mode, the minion should keep communicating with the first master in the list and should not attempt to change masters unless that master has a communication problem. test.ping from the master should not fail every 30 seconds. The minion should fail back to the first master only when master_failback is true.
intermittent and periodic master ping failures
minion log
Notes: