Description
Overview of the Issue
I upgraded a Consul server from 1.14.4 to 1.14.11 without issue, but when I then try to upgrade it from 1.14.11 to 1.15.10, I get the following error message on startup: "refusing to rejoin cluster because server has been offline for more than the configured server_rejoin_age_max (168h0m0s) - consider wiping your data dir".
The server has been offline for only a fraction of a second during the systemctl restart consul command, so I’m not sure what the message is referring to here.
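To illustrate why the message is surprising, here is a minimal Python sketch of the kind of age check the error text seems to describe. It is not Consul's actual implementation: the function name `is_last_seen_stale` and the idea of a stored "last seen" timestamp are assumptions for illustration only. The only value taken from the report is the 168h limit quoted in the error.

```python
from datetime import datetime, timedelta, timezone

# 168h0m0s, the limit quoted in the error message (7 days).
SERVER_REJOIN_AGE_MAX = timedelta(hours=168)


def is_last_seen_stale(last_seen, now, max_age=SERVER_REJOIN_AGE_MAX):
    """Hypothetical check: True if more than max_age has passed since last_seen."""
    return now - last_seen > max_age


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 8, 12, 0, 0, tzinfo=timezone.utc)

# A restart that takes about a second should not be considered stale.
restart_gap = is_last_seen_stale(now - timedelta(seconds=1), now)

# Only a timestamp older than 7 days should trip the check.
old_timestamp = is_last_seen_stale(now - timedelta(days=8), now)

print(restart_gap, old_timestamp)
```

Under these assumptions, a sub-second restart is nowhere near the limit, so the error would only make sense if whatever timestamp is being compared is much older than the actual downtime.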
Posting this question on discuss.hashicorp.com hasn't resulted in any obvious answer, so perhaps this is a bug?
Reproduction Steps
- Create a cluster with three server nodes (this is what I have, but might not matter at all).
- Make sure each server is running Consul 1.14.11.
- Change the Consul binary of one of the non-leader servers to version 1.15.10.
- Restart Consul.
- Check the logs for the error message above.
Consul info for Server
Server info
Output from server 'consul info' command here:
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = c0c5688c
    version = 1.14.11
    version_metadata =
consul:
    acl = disabled
    bootstrap = false
    known_datacenters = 1
    leader = false
    leader_addr = 192.168.40.23:8300
    server = true
raft:
    applied_index = 167628074
    commit_index = 167628074
    fsm_pending = 0
    last_contact = 30.184466ms
    last_log_index = 167628074
    last_log_term = 13469
    last_snapshot_index = 167622202
    last_snapshot_term = 13469
    latest_configuration = [{Suffrage:Voter ID:11659f41-183a-8ed2-ed11-5be8e2044ea4 Address:192.168.40.22:8300} {Suffrage:Voter ID:3df4f76a-9bae-f14a-785e-da0903cb5241 Address:192.168.40.23:8300} {Suffrage:Voter ID:ab60d46b-23fc-7fe4-4c34-5677356857b5 Address:192.168.40.21:8300}]
    latest_configuration_index = 0
    num_peers = 2
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Follower
    term = 13469
runtime:
    arch = amd64
    cpu_count = 2
    goroutines = 261
    max_procs = 2
    os = linux
    version = go1.20.10
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 108
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 36413
    members = 27
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 2273
    members = 3
    query_queue = 0
    query_time = 1
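For anyone reproducing this, a small Python sketch like the one below could be used to confirm the version and Raft state of each server from saved `consul info` output before swapping the binary. The parser `parse_consul_info` is a hypothetical helper, not part of Consul, and it assumes the output keeps the `section:` / `key = value` layout shown above. The sample text is a shortened excerpt of the output in this report.

```python
def parse_consul_info(text):
    """Parse 'section:' headers and 'key = value' lines into nested dicts."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            # Split on the first '=' only; values may be empty (e.g. 'prerelease =').
            key, _, value = line.partition("=")
            if current is not None:
                sections[current][key.strip()] = value.strip()
        elif line.endswith(":"):
            current = line[:-1]
            sections[current] = {}
    return sections


# Shortened excerpt of the output above.
sample = """\
build:
    prerelease =
    version = 1.14.11
consul:
    leader = false
    server = true
raft:
    state = Follower
"""

info = parse_consul_info(sample)
print(info["build"]["version"], info["consul"]["leader"], info["raft"]["state"])
```

This only checks the preconditions from the reproduction steps (version 1.14.11, non-leader server); it does not interact with a running agent.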
Config file from server (JSON):
{
  "node_name": "consul01",
  "bind_addr": "192.168.40.21",
  "client_addr": "127.0.0.1 192.168.40.21",
  "datacenter": "XXXXXXX",
  "server": true,
  "bootstrap_expect": 3,
  "data_dir": "/var/db/consul",
  "encrypt": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "retry_join": [
    "192.168.40.22",
    "192.168.40.23"
  ],
  "tls": {
    "defaults": {
      "key_file": "/etc/consul.d/tls/XXXXXXX-server-consul-0-key.pem",
      "cert_file": "/etc/consul.d/tls/XXXXXXX-server-consul-0.pem",
      "ca_file": "/etc/consul.d/tls/consul-agent-ca.pem",
      "verify_incoming": true,
      "verify_outgoing": true
    },
    "internal_rpc": {
      "verify_server_hostname": true
    }
  },
  "auto_encrypt": {
    "allow_tls": true
  },
  "ports": {
    "https": 8501
  },
  "peering": {
    "enabled": false
  },
  "connect": {
    "enabled": false
  }
}
Operating system and Environment details
Consul servers are running on Red Hat Enterprise Linux Server release 7.6 (Maipo).